{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Simple data representations\n",
    "\n",
    "Before we delve into learnable data representations, feature crosses, etc., let’s look at simpler data representations. We can think of them as common idioms in machine learning: not quite patterns, but commonly employed solutions nevertheless."
   ]
  },
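  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Scaling helps\n",
    "\n",
    "Models trained on scaled data often converge faster and are therefore faster and cheaper to train. The cells below time a linear regression fit on one raw feature and on the same feature scaled to $[-1, 1]$."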
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn import datasets, linear_model\n",
    "diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)"
   ]
  },
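  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use a single feature (column 2), kept as a 2-D column for scikit-learn\n",
    "raw = diabetes_X[:, None, 2]\n",
    "\n",
    "# Min-max scale the feature to the range [-1, 1]\n",
    "max_raw = raw.max()\n",
    "min_raw = raw.min()\n",
    "scaled = (2*raw - max_raw - min_raw)/(max_raw - min_raw)"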
   ]
  },
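  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above applies min-max scaling to the range $[-1, 1]$. As a brief sketch of why the result lands in that range (the symbols $x$, $x_{\\min}$, $x_{\\max}$ and $x'$ are notation introduced here for `raw`, `min_raw`, `max_raw` and `scaled`; they do not appear in the code):\n",
    "\n",
    "$$x' = \\frac{2x - x_{\\max} - x_{\\min}}{x_{\\max} - x_{\\min}} = 2\\,\\frac{x - x_{\\min}}{x_{\\max} - x_{\\min}} - 1$$\n",
    "\n",
    "Substituting $x = x_{\\min}$ gives $x' = -1$ and $x = x_{\\max}$ gives $x' = 1$, so every value of the feature is mapped into $[-1, 1]$."
   ]
  },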
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Raw: 0.3075s, Scaled: 0.2774s, Improvement: 9.79%\n"
     ]
    }
   ],
   "source": [
    "def train_raw():\n",
    "    linear_model.LinearRegression().fit(raw, diabetes_y)\n",
    "\n",
    "def train_scaled():\n",
    "    linear_model.LinearRegression().fit(scaled, diabetes_y)\n",
    "\n",
    "# Time 1000 fits on the raw feature and on the scaled feature\n",
    "import timeit\n",
    "raw_time = timeit.timeit(train_raw, number=1000)\n",
    "scaled_time = timeit.timeit(train_scaled, number=1000)\n",
    "print('Raw: {:.4f}s, Scaled: {:.4f}s, Improvement: {:.2f}%'\n",
    "      .format(raw_time, scaled_time, 100*(raw_time-scaled_time)/raw_time))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Numerical inputs\n",
    "\n",
    "One key predictor of a baby's weight is the mother's age. We can verify this by looking at the average weight of babies born to mothers of different ages. Because the dataset is large, we run the aggregation in BigQuery:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bigquery df\n",
    "SELECT\n",
    "  mother_age,\n",
    "  COUNT(1) AS num_babies,\n",
    "  AVG(weight_pounds) AS avg_wt\n",
    "FROM\n",
    "  publicdata.samples.natality\n",
    "WHERE\n",
    "  year > 2000\n",
    "GROUP BY mother_age\n",
    "ORDER BY mother_age"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAELCAYAAAAiIMZEAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3Xl8VOX1+PHPmSSTnYSsLGFJ2MMOARUQUdSKsohaRdtqrS3SqrX9dtHan23t3q9+22pbtbh207qBe1GxorgAguwJW1hDIAkEyL6f3x8zwRCyTNY7Sc779coL5s7NvYcLnDxz7nPPI6qKMcaY7sXldADGGGPanyV3Y4zphiy5G2NMN2TJ3RhjuiFL7sYY0w1ZcjfGmG7IkrsxxnRDzSZ3ERkhIpvqfBWIyHca2G+W9/3tIvJ+x4RrjDHGF9KSh5hEJAA4DJyjqgfqbI8GPgYuU9WDIpKgqrntHq0xxhifBLZw/9lAZt3E7nUDsExVDwL4ktjj4uJ08ODBLTy9Mcb0bBs2bDimqvHN7dfS5L4IeLaB7cOBIBFZBUQCD6rq35s60ODBg1m/fn0LT2+MMT2biNQfXDfI5+QuIm5gPvCjRo4zGc/IPhT4RETWqOquesdYDCwGGDhwoK+nNsYY00ItmS0zB/hMVXMaeC8LeEtVi1X1GPABML7+Tqq6VFXTVDUtPr7ZTxXGGGNaqSXJ/XoaLskAvALMEJFAEQkDzgEy2hqcMcaY1vGpLCMi4cAlwK11ti0BUNVHVTVDRFYAW4Aa4HFV3dYB8RpjupnKykqysrIoKytzOhS/EhISQlJSEkFBQa36/hZNhWxPaWlpajdUjTH79u0jMjKS2NhYRMTpcPyCqnL8+HEKCwtJTk4+4z0R2aCqac0dw55QNcY4qqyszBJ7PSJCbGxsmz7NWHI3xjjOEvvZ2npNHEvu+cUVTp3aGGO6PceS+5FTdvPEGNOzvfzyy6Snp3fIsR1L7jWq1NTY4tzGmJ6rWyZ3gNLKaidPb4wxAFx55ZVMnjyZ0aNHs3TpUh599FF+8IMfnH7/6aef5vbbbwfgF7/4BSNGjGDGjBlcf/31PPDAAw0eMzc3l8mTJwOwefNmRISDBw8CMGTIED7++GNeffVVfvCDHzBhwgQyMzPb9c/U0t4y7aqkoprwYEdDMMb4kfte2056dkG7HjO1Xy9+Om90k/s8+eSTxMTEUFpaypQpU3j33XeZPn06999/PwDPPfccP/7xj/n000956aWX2Lx5M5WVlUyaNOl0Aq8vISGBsrIyCgoKWL16NWlpaaxevZoZM2aQkJDAtGnTmD9/PnPnzuWaa65p1z8zOJzcSyts5G6Mcd5DDz3E8uXLATh06BD79u0jJSWFNWvWMGzYMHbs2MH06dN58MEHWbBgASEhIYSEhDBv3rwmjztt2jQ++ugjPvjgA+655x5WrFiBqnL++ed3+J/J2ZF7ZZWTpzfG+JnmRtgdYdWqVaxcuZJPPvmEsLAwZs2aRVlZGYsWLeL5559n5MiRLFy4sFVTE2fOnMnq1as5cOAACxYs4He/+x0iwhVXXNEBf5IzOVpzL7GRuzHGYadOnaJ3796EhYWxY8cO1qxZA8DChQt55ZVXePbZZ1m0aBEA06dP57XXXqOsrIyioiJef/31Jo99/vnn889//pNhw4bhcrmIiYnhzTffZMaMGQBERkZSWFjYIX8uZ2+oWnI3xjjssssuo6qqilGjRnH33Xdz7rnnAtC7d29GjRrFgQMHmDp1KgBTpkxh/vz5jBs3jjlz5jB27FiioqIaPfbgwYNRVWbOnAnAjBkziI6Opnfv3gAsWrSI+++/n4kTJ7b7DVXHessE9x2mr7/7IZekJjpyfmOMf8jIyGDUqFFOh+GzoqIiIiIiKCkpYebMmSxdupRJkyZ1yLkauja+9pZxeLaM1dyNMV3L4sWLSU9Pp6ysjJtuuqnD
EntbOT4V0hhjupJnnnnmrG233XYbH3300Rnb7rzzTm6++ebOCussltyNMaaN/vKXvzgdwlkcvqFqZRljjKd/uTlTW69Js8ldREaIyKY6XwUi8p1G9p0iIlUi4tPjVjZyN8aEhIRw/PhxS/B11C7WERIS0upjNFuWUdWdwAQAEQkADgPL6+/nfe93wNu+nNglYsndGENSUhJZWVnk5eU5HYpfqV1mr7VaWnOfDWSq6oEG3rsDeAmY4suBXGLz3I0xEBQUdNZScqbtWlpzXwQ8W3+jiPQHFgKP+Hxil1BiXSGNMaZD+JzcRcQNzAdeaODtPwJ3qWpNM8dYLCLrRWR9TXW13VA1xpgO0pKyzBzgM1XNaeC9NODf3sY6ccDlIlKlqi/X3UlVlwJLAXoPGqlWczfGmI7RkuR+PQ2UZABU9XTBTESeBl6vn9jrsxuqxhjTcXwqy4hIOHAJsKzOtiUisqTVJxax9gPGGNNBfBq5q2oxEFtv26ON7PtVX47pctk8d2OM6SiOPaHqErGpkMYY00Ec6y1jNXfjhPKqanILysktLCe/uIL84nLyiyvJLy7neHEF+cUVnCyppEYVARBBPL8AIECASwgODMAd6CI40FXv1wCGxEdwbkoMyXHhrVq9x5j24GByh9LKampqFJfL/gOY9lFYVklmXjH7jxWTfaqUo6fKOHKqzPtrKceKKhr8vpAgF7HhwcSEu4kOCyLA+29SFZQz+3xUVSulldWcKq2kvKqaiqoayr1fpRXVlHqf30jsFcy5KbGnvwbHhlmyN53G0ZE7QFlVNWFuR5tTmi7oRHEFO44WsieviMzcIvZ4v44WlJ2xX6+QQPpFh9InKoQx/XvRp1cofaNDiI8MJjbcTYz3q73+Daoq+44Vs2ZvPp/sPc7Hmcd5ZVM2AH16hXBOSgznJMcyNbk3Q+IjLNmbDuNccncJiuemqiV30xxVZU9uEe9k5LAyPYeNh05SO5gOdwcwNCGCaUNiGZIQwdCECIbEh9M3KpTw4M79tyUipMRHkBIfwQ3nDERV2XusmDV7j7Nmb/4ZyT423M2UwTFMSY7hnOQYRvXtdfoTgzFt5WhZphrrL2MaV1ldw6f781mZnsu7O3I4cLwEgLH9o7hz9jAmD+rN0IQI+vQK8dsRsIgwJD6CIfERfOmcQagq+4+XsG7fcdbtO8G6/cdZsf0oAJHBgUxJjmHG0DjOHxbH0AQb2ZvWc7QsU41NhzSfU1V25xbxSeZx1uw9zkd7jlFQVoU70MX0IbEsnpnC7JGJ9IlqfRtUp4kIyXHhJMeFc92UgQAcOVXKun35rN2XzyeZx/nvjlzAU8aZMcyT6KcPjSMuItjJ0E0X43jNvdgeZOqxakstn+z1JPO1e/M5Xuy54dk/OpQvjO7DxamJnD8srluX7vpGhbJgQn8WTOgPQNaJEj7cfYzVu4+xMiOHFzdkAZDatxdXjOvLwon96Rcd6mTIpgtwtCwDVpbpaVSVTYdO8vLGw7yx9SjHisoB6BcVwgUj4jk3JZbzUmIZEBPmcKTOSeodxqKpA1k0dSDVNcr27FOs3n2MVTtzuf+tnTzw9k6mD4nj6sn9uWx0X0LdAU6HbPyQ4yN3K8v0DPuPFfPypsO8vPEw+4+X4A50cfGoBGYNT+DclFgGxIRafbkBAS5hXFI045Kiue3CoRw8XsJLn2WxbGMW331uM/cGb+fysX24elISU5Nj7Bqa0xydLQNYf5lu7ERxBa9vyWbZxsNsPHgSETg3OZZvzRrKZWP70CskyOkQu5yBsWF895Lh3Dl7GOv25/PShize2HKE59dn0S8qhHNSYkkb3Jupg2MYEh9hz5D0YFaWMe1uV04hT320j2WfHaa8qoYRiZHcPWck88f3s1pxO3G55PTDUfctGM2KbUd5Jz2H1buPsXzjYQCiw4JIG9SbtMExTBncm3FJ0QQFONZxxHQyK8uYdlFTo3ywO48nPtzH6t3HCA50cdWkJL5y7iBS+/Vy
OrxuLcwdyFWTkrhqUhKqyoHjJazbn8/6/fms33+ClRme2Td9o0L42vRkrps6wD419QCOJ/dSW2qvSyutqGbZxiye/HAfmXnFJEQG8/1Lh3PDOYOICXc7HV6PIyIMjgtncFw416YNAOBYUTlr9h7nX2sO8qs3M3jw3d1cP3UAN09Ptk9S3ZhjyV3Ec7PIau5dU0lFFU99tJ/HVu/lZEklY/r34g/XjeeKsf1wB9pHf38SFxHM3HH9mDuuH1uzTvHY6r08+dF+nvpoP3PH9eXr56cwpn+U02Gadubo5OGwoAAry3QxFVU1PPfpQR58dw/Hisq5aGQCt85MsZkaXcTYpCgeun4iP7xsBE99tJ9/rzvIy5uyOS8llqsm9efS0X2ICrWSTXcgdbvddaa0tDR1LfwtF41M4LdXj3MkBuO7mhrl1c3Z/P6dXRzML2Hq4BjumjOCyYNinA7NtMGp0kr+ve4g/1hzgKwTpbgDXMwcHscV4/py8ahEIq0273dEZIOqpjW3X7MjdxEZATxXZ1MK8BNV/WOdfb4E3IWn3XUh8E1V3dzcscPcARTbyN2vqSrv7czlf1fsZMfRQlL79uKpm6cwa3i8jdS7gajQIG69YAiLZ6awOesUr2/O5o2tR1iZkYs70MWs4fGnE31nN2EzbdPs35aq7gQmAIhIAHAYWF5vt33ABap6QkTmAEuBc5o7dqg7kFKrufutdfvyuf+tHXy6/wSDY8N46PqJzB3b1+ZOd0MiwoQB0UwYEM09l49i46GTvL4lmze3HuHt9Bxiw93cNWck10xKsr//LqKlP4pnA5mqeqDuRlX9uM7LNUCSLwcLc1vN3R9tO3yK+9/ayfu78kiIDOaXV47huikDbI50D+FyCZMH9WbyoN7ce0Uq6/bnc/9bO/nhi1t47tND/HzBaEb3sxuw/q6lyX0R8Gwz+9wC/MeXg4W5Aygss5G7v9iTW8jv39nFm1uPEh0WxD2Xj+TG8wYTEmS9S3qq2oelXrj1PF76LIvf/mcH8/70ITeeN5j/uXS4zZf3Yz4ndxFxA/OBHzWxz4V4kvuMRt5fDCwGGDhwIJOCAsgtKG9RwKb9Hcov4cF3d7PssyxCgwK4c/Ywbjk/2f7jmtNcLuGLaQO4NLUPD7y9k799sp/XtxzhnstHsnBif7v/4odaMnKfA3ymqjkNvSki44DHgTmqeryhfVR1KZ56PGlpaRoeHEhJpY3cnVJUXsWDK3fx9Mf7ERFumZHMkguGEGt9w00josKC+MWVY7g2bQD3vrKN/3l+M/9ed4hfXzWWoQkRTodn6mhJcr+eRkoyIjIQWAZ8RVV3+XrAUHeA9ZZxgKqyYttR7nstnaMFZVyXNoDvXDKMvlH2tKLxzdikKJZ9cxrPrz/Eb1fsYO6fVvPTeaNZNGWAjeL9hE/JXUTCgUuAW+tsWwKgqo8CPwFigYe9f7FVvszDtIeYOt/B4yX89NVtvLczj1F9e/HwlycxaWBvp8MyXZDLJSyaOpALRybwP89v4kfLtvL+zjx+e/VYosOs9YTTfEruqlqMJ3nX3fZond9/Hfh6S08e5g6gtLKamhq16VUdrLyqmsc+2Muf/ruHQJdw79xUbjpvEIE2A8a0UWKvEP7xtXN4bPVeHnh7J5f98SR/uG4C5w2Jbf6bTYdx9H92qDsQVSirstF7R/o48xiXP7iaB97exexRCaz83gXcMiPZErtpNy6XcOsFQ1j2zemEuQO44fE1/O+KHVRW1zgdWo/l6P/uMO/yYFaa6RhlldX85JVt3PDYWiqrladunsLDX5pstXXTYcYmRfH6t2dwXdoAHl6VyTWPfsL+Y8VOh9UjOTxy9yR3u6na/tKzC5j3pw/5+ycHuGVGMm9/dyYXjkhwOizTA4S5A/nt1eN45EuT2H+smHl//pBD+SVOh9Xj2Mi9m6mpUR5fvZcr//IRp0or+cctU7l3bqo9iGQ63ZyxfXn19umg8L3nN1Nd40yTwp7KT5K7zXVv
DzkFZdz01Dp++UYGF4yIZ8V3ZnL+sHinwzI92KDYcH42fzTr9ufzxId7nQ6nR3G2n7vbc3ory7TdW9uPcvdLWyitrObXC8dy/VSbb2z8w1WT+vN2+lEeeGsXFwxPYESfSKdD6hH8ZORuyb21jhWV84MXNnPrPzbQv3cor99xPjecM9ASu/EbIsKvF46lV2gg331uExVVNoOmM/hHcrd1VFusvKqav76fyaz7V7F842G+OcszDc0eATf+KDYimN9cNY70IwU8+K7PD7GbNnC0LBN6uixjNXdfqSpvbc/h129mcDC/hItGJvDjK0YxJN6SuvFvl6Qmcm1aEo+syuSikYlMHmRPRnckZ0fuQVaWaYn07AJueGwtS/65geBAF3/72lSe/OoUS+ymy7h3bip9o0L53vObbCJFB/OLee6W3JuWX1zBj5Zt4Yo/rWbH0QJ+sWA0/7nzfC4YbjNhTNcSGRLE/107ngP5JfzmzR1Oh9OtOVqWCQ504RKbCtmUdzNyuOulrZwsqeBr05P59kXDiAqzPuum6zo3JZZbpifz+If7uDg10QYpHcTRkbuIEOYOtJF7A4rKq7j7pS3c8rf1xEW4ee2OGdw7N9USu+kWvv+FEQxLiOCHL27mZEmF0+F0S453jrKe7mf7dH8+cx78gOfWH2LJBUN45fbpjOrby+mwjGk3IUEB/OG6CRwvqrDyTAdxPLnbItmfK6+q5rf/2cG1f/0EgOdvPY+754wkONBaB5juZ0z/KL587iBe+iyLwydLnQ6n2/GD5G5lGYCMIwUs+PNHPPp+JtelDeA/d85kyuAYp8MypkMtnpkCwGMfWGuC9tZscheRESKyqc5XgYh8p94+IiIPicgeEdkiIpN8DcCzYEfPvqH6xpYjLPjLRxwrKufxG9P47dXjiAh29F63MZ2iX3QoV03qz7PrDnKsqNzpcLqVZpO7qu5U1QmqOgGYDJQAy+vtNgcY5v1aDDziawA9uSyjqjz2wV5ue+YzxvaPYsV3ZnJxaqLTYRnTqZZcMISK6hqe/HCf06F0Ky0ty8wGMlX1QL3tC4C/q8caIFpE+vpywNCgnnlDtbpG+dmr2/nVmxlcPrYP//r6OcRFBDsdljGdLiU+gsvH9OUfnxygoKzS6XC6jZYm90XAsw1s7w8cqvM6y7utWWHuAIp72Dz30opqlvxzA3/75ADfOD+ZP18/yfqtmx7tm7OGUFhexT8+qT9uNK3lc3IXETcwH3ihtScTkcUisl5E1ufl5QGe/jI9aeR+rKicRY+tYWVGDj+bl8qPr0i1xcFNjzemfxSzRsTz5If7elQ+6EgtGbnPAT5T1ZwG3jsMDKjzOsm77QyqulRV01Q1LT7e81RaT6q5Z+YVcdXDH7PzaAF//fJkvjo92emQjPEb35o1lOPFFTy//lDzO5tmtSS5X0/DJRmAV4EbvbNmzgVOqeoRXw7qmS1TjWr3XoLrg115XP3IxxSXV/HsN87l0tF9nA7JGL8yNTmGKYN789f3M63nezvwKbmLSDhwCbCszrYlIrLE+/JNYC+wB3gM+JavAYS6A1CFssru+Ze5bl8+1y9dw41PriMmzM2yb01j4kBrdWpMQ7514VCyT5XxyqazPvibFvJpMrWqFgOx9bY9Wuf3CtzWmgDCvT3dSyqqTneJ7A42HDjBH1fuYvXuY8RFBPOTuanccM5Au3FqTBNmDY8ntW8vHnk/k6smJRFg96NazfEnZeq2/Y1tZt+uYPOhk/xh5S5W7cwjNtzNjy8fxZfPHdStfnAZ01FEhG9dOITbn9nI29uPMmesTzOqTQMcT+61S+2VdvGl9g6fLOWnr2xjZUYu0WFB3HXZSG48bxDh9qSpMS0yZ0xfkuN28ZdVe7hsTB9bD7iVHM883WGR7A925XHnvzdSWa18/9Lh3DRtMJEh1prXmNYIcAlLLkjhrpe2snr3MWZav/dWcbxxWGjQ5zX3rqamRvnTu7u56al1JESG8Ort07n9omGW
2I1po4UTk+gbFcJf3tvjdChdluPJ/fTIvbxrjdxPlVTyjb+v5//e2cX88f1Yfts0UmwtU2PahTvQxTfOT2Htvnw2HTrpdDhdkv8k9y5Uc9+efYp5f/6QD3bncd/80fzxugmEuR2vcBnTrXwxLQl3oIuXN9q0yNZwPLnXziIp7SJlmRc3ZHHVwx9TUVXDvxefx03TBtsNH2M6QGRIEBeNSOCNrUeoruneDzl2BMeTe9jpee7+PXKvqq7h3pe38f0XNjNpYG9e//YMJg+yh5GM6Ujzxvcjr7CctfuOOx1Kl+MHyd3/Z8uUVVZz2zOf8Y81B7h1Zgr/uGWqtec1phNcNDKBcHcAr232qZuJqcPx5B4c6MIl+G0nuMKySm5+6lPe2p7DT+el8qPLRxEY4PhlM6ZHCHUHcElqIv/ZdoTK6u7ZoqSjOJ6lRMRv11E9XlTODY+t5dP9+fzxugncbF0cjel088b342RJJR/uPuZ0KF2K48kdPD+d/W0d1cMnS/nio5+wO7eQx25M48qJPq09YoxpZ+cPi6dXSCCvbc52OpQuxS/m7/lbT/fdOYV85Yl1lFRU8c9bziFtcIzTIRnTY7kDXcwZ05c3th6hrLLamu/5yD9G7kEBFPvJQ0wbD57gi3/9hGpVnrv1PEvsxviBeeP7UVRexaqduU6H0mX4RXIP85OyzKf78/nS42vpFRLES0umMapvL6dDMsYA56bEEBfhtlkzLeAnyd35G6plldV8/4XNxEcG8+I3z2NgbJij8RhjPhcY4OKKsX15d0cOReXODwS7Ar9I7qHuAMenQj783h4OHC/hNwvHkhAZ4mgsxpizzRvfj7LKGlamN7SMs6nP12X2okXkRRHZISIZInJevfejROQ1EdksIttF5OaWBOH0DdU9uUU88n4mV07ox7ShcY7FYYxp3KSBvekXFWKzZnzk68j9QWCFqo4ExgMZ9d6/DUhX1fHALOD/RMTtaxBOlmVUlXtf3kZIUAA/viLVkRiMMc1zuYS54/vxwe48TpZUOB2O32s2uYtIFDATeAJAVStUtX4PTgUixdNBKwLIB3wujIW5AxxrHPbKpmw+2Xucuy4bSXyktRQwxp/NG9ePymrlre1HnQ7F7/kyck8G8oCnRGSjiDwuIuH19vkzMArIBrYCd6rqWc8Ki8hiEVkvIuvz8vJObw9zB1BSWY1nne3Oc6qkkl++kc6EAdHcMHVgp57bGNNyY/r3YnBsmM2a8YEvyT0QmAQ8oqoTgWLg7nr7fAHYBPQDJgB/FpGz5hGq6lJVTVPVtPj4z5fOCnUHoArlVZ3bO+J/39pBfnEFv1o4Bpetsm6M3xMR5o/vx8eZx8grLHc6HL/mS3LPArJUda339Yt4kn1dNwPL1GMPsA8Y6WsQYUGd3xnys4MneGbdQW6enszoflGddl5jTNvMG9+PGoU3t9rovSnNJndVPQocEpER3k2zgfR6ux30bkdEEoERwF5fg6jt6V7cSfNXq6pr+PHybSRGhvDdS4Z3yjmNMe1jWGIkI/tE2qyZZvg6W+YO4F8isgVP2eXXIrJERJZ43/8FME1EtgLvAnepqs8t3E6vxtRJS+09/fF+Mo4U8NN5qUQE+0V7HWNMC8wb34/1B05w+GSp06H4LZ8ym6puAtLqbX60zvvZwKWtDaIzF+zIPlnK79/ZxYUj4rlsTJ8OP58xpv3NHdeX+9/ayRtbslk8c4jT4fglv3lCFaCkE6ZD/vy1dKprlJ8vGGNrnxrTRQ2KDWd8UhSvWmmmUX6R3Gtr7h3dguA/W4+wYvtRvj17GANirHeMMV3Z/An92Xa4gD25hU6H4pf8IrmHd0JZJrewjHuWb2Vs/ygWz0zpsPMYYzrH/PH9CHAJyzcedjoUv+QXyf30DdUOSu6qyj3LtlJcUc3vrx1PkK2BakyXFx8ZzIyhcby8MZuams59ALIr8IssV1uW6aia+wsbsliZkcsPvzCCYYmRHXIOY0znWzixP4dPlvLp
/nynQ/E7fpLcvWWZDpgKmXWihJ+/ls45yTF8zRa4NqZbuXR0ImHuAF7eZKWZ+vwiuQcHuhCBknZeaq+mRvn+C5tRVR744nhrMWBMNxPmDuSy0X14fYtnfVXzOb9I7iJCWFD793R/+uP9rNmbz0/mpdrsGGO6qYWT+lNYVsV7O2x91br8IrkDhLoD23Ud1T25RfxuxQ4uGpnAtWkD2u24xhj/Mm1IHPGRwTZrph6/Se7tuRpTVXUN33t+E2HuAH579Vh7WMmYbizAJSwY34/3duZyotgW8ajVLZP7w6sy2Zx1il9eaeuhGtMTLJzUn8pq5Q3rFHmaXyX39pjnvu3wKR56dzcLJvTjinF92yEyY4y/S+3bi+GJEbxspZnT/Ci5B7bLPPf7XttOTLibn88f0w5RGWO6AhFh4cQk1h84wcHjJU6H4xf8JrmHtkNZZufRQj7df4JvnJ9CVFhQO0VmjOkKFkzoB2Bz3r38JrmHuQPa3M/92XUHcQe4uHpyUjtFZYzpKvpFh3JuSgzLNx7u9PWY/ZFfJfe2jNxLK6pZ9lkWc8b2ISbc3Y6RGWO6iqsmJrHvWDGbs045HYrjfEruIhItIi+KyA4RyRCR8xrYZ5aIbBKR7SLyfksDCQ0KpKQNy+y9sfUIBWVVXD91YKuPYYzp2i4b24fgQJfdWMX3kfuDwApVHQmMBzLqviki0cDDwHxVHQ18saWBhLkDKKmsbvXHqWfWHiAlPpxzkmNa9f3GmK6vV0gQF6cm8trmbCqra5wOx1HNJncRiQJmAk8AqGqFqp6st9sNwDJVPejdp8XPAYe6A1CF8qqW/4XsOFrAZwdPcsPUgfbAkjE93MIJ/TleXMHq3XlOh+IoX0buyUAe8JSIbBSRx0UkvN4+w4HeIrJKRDaIyI0tDaQt66g+u/Yg7kAXV0+yG6nG9HQzh8fTOyyI5Rt79hJ8viT3QGAS8IiqTgSKgbsb2GcycAXwBeBeERle/0AislhE1ovI+ry8M3+qhreyp3tpRTXLNh7m8jF96G03Uo3p8dyBLuaN78fb249SWFbpdDiO8SW5ZwFZqrrW+/pFPMm+/j5vqWqxqh4DPsBTmz+Dqi5V1TRVTYuPjz/jvdauxvT6lmwK7UaqMaaOKyegLAA7AAAXl0lEQVT2p7yqhhXbjjodimOaTe6qehQ4JCIjvJtmA+n1dnsFmCEigSISBpxDvZuuzWltWeaZdQcZEh/OVLuRaozxmjggmn5RIazMyHE6FMcE+rjfHcC/RMQN7AVuFpElAKr6qKpmiMgKYAtQAzyuqttaEkhoK5J7xpECNh48yf+7YpTdSDXGnCYiXDAigdc2Z1NRVYM70G8e6ek0PiV3Vd0EpNXb/Gi9fe4H7m9tILXrqLakp/uz6zw3Uq+xJ1KNMfXMGhHPs+sOsuHACc4bEut0OJ3Ob36c1ZZlin1caq+koorlnx3mirF9iQ6zG6nGmDNNHxpHUICwalfPXKHJb5J7aFDLbqi+vuUIheV2I9UY07CI4ECmDI5h1Y6eOd/db5L75zdUfSvLPLP2IEMTIpgyuHdHhmWM6cJmjYhnZ04h2SdLnQ6l0/lRcvfOc/ehM2R6dgGbDtkTqcaYps0akQDA+7t63ujdb5J7SJALEd/KMrU3Uq+a1L8TIjPGdFXDEiLoHx3Kqp09r+7uN8ldRAgLar7tb0lFFS9vPMxcu5FqjGmGZ0pkPB/uPkZFK/pWdWV+k9wBQt2BzSb3VTvzKCyv4topAzopKmNMVzZreDzFFdWsP5DvdCidyq+Su2eR7KZvqGYcKSDAJUwYEN1JURljurJp3imR7+/sWXV3v0vuzY3cd+UUMig2jBDv1EljjGlKRHAgU5NjeK+H1d39Krn7skj27pwihidEdlJExpjuYNbwBHblFPWoKZF+ldw9I/fGyzJlldXsP17M8MSITozKGNPVzRrh6UK7qgeVZvwquYcGNX1DdW9eMTUKwxJt5G6M8d3QHjgl0q+Se5g7
gNImHmLanVsIwHBL7saYFhARZo2I56M9PWdKpN8l96ZG7rtyCgl0Cclx9Vf5M8aYps0akeCZErm/Z0yJ9LPkHtjkE6q7cooYHBfeI3szG2PaZtqQWNwBLlb1kFYEfpUla2+oqmqD7+/KKWSElWSMMa0QHhzIlOTePabu7lNyF5FoEXlRRHaISIaInNfIflNEpEpErmlNMKHuAGoUyhuoiZVWVHMwv4RhNlPGGNNKF47wTIk83AOmRPo6cn8QWKGqI/EsfH3W+qgiEgD8Dni7tcGENbFIdmZeEap2M9UY03qfT4ns/qP3ZpO7iEQBM4EnAFS1QlVPNrDrHcBLQKuv2ume7g3MmNmVUztTxkbuxpjWGRJfOyWy+9fdfRm5JwN5wFMislFEHheRM6ariEh/YCHwSFuCCa3t6V5+9oNMu3KKCAoQBsXaTBljTOvUnRJZXuXbqm9dlS/JPRCYBDyiqhOBYuDuevv8EbhLVZucQCoii0VkvYisz8s7+ydnWFDtakxnX/TdOYWkxEUQFOBX94CNMV3MhSMSKKmoZv3+E06H0mJ/+3i/z/v6kimzgCxVXet9/SKeZF9XGvBvEdkPXAM8LCJX1j+Qqi5V1TRVTYuPjz/rRJ8vtddAWSa30G6mGmPabNpQ75TILlh3f+7TQz7v22xyV9WjwCERGeHdNBtIr7dPsqoOVtXBeJL/t1T1ZZ+j8AqtvaFaeWZZpqSiikP5pXYz1RjTZmFuT5fIrlZ3r6lR9h4r8nl/X2scdwD/EpEtwATg1yKyRESWtCLGRp1eR7XeyH13jucPZMndGNMeLhyZwO7cIjYfamhuiH86UlBGWaXvrRN8Su6quslbThmnqleq6glVfVRVH21g36+q6ostiPm0xsoyNlPGGNOerk1LIi7CzS9eT2/0oUl/k5nr+6gd/PAJVTh7nvvu3CLcgS6bKWOMaReRIUF879IRrD9wgje2HnE6HJ9k5nXp5N5wWWZXTiFD4iMIcIkTYRljuqFr0wYwsk8kv3lzB2VNdKP1F5l5RfQKCfR5f79K7iFBLkQ4ax3V3TlFVpIxxrSrAJfwk3mpHD5ZyhMf7nM6nGZl5hYzJMH3POhXyV1ECA0KoLjOyL2ovIrDJ22mjDGm/U0bEselqYk8/N4ecgvKnA6nSXuPFTEkvosmdzi7p/tu783UYS34iWWMMb665/JRVFTX8MDbO50OpVGFZZXkFJR37eQe6g44oyxj0yCNMR1pcFw4X502mBc2ZLHt8Cmnw2nQ3rxiAFLifZ9U4nfJPazeOqo7cwoJCXIxICbMwaiMMd3Z7RcNo3eY/06NrJ0p0/VH7nXuXO/KKWRogs2UMcZ0nKjQIP7nkuGs3ZfPW9uPdui5tmef4lRpZYu+JzOviECXMCjW90Gu3yX38OD6NfcihidYScYY07EWTRnA8MQIfvVmRod1jCyvqubqRz7moXd3t+j79uYVMzA2rEWNE/0uuYfWKcucKq3kaEEZw6zebozpYIEBLu6dm8qh/FKe+mh/h5xjd04RZZU1fHawZR0pM/OKSIlr2aQSv0vuYXVuqO7JtbYDxpjOc/6weGaPTODP/91DXmF5ux8//UiB59fsAiqrfesTU1Vdw/5jJQxJaNkT+n6Z3GtH7rtspowxppPdc8UoyiqreeCt9p8amZ7tSe7lVTXsPFro0/dknSilorqmRTdTwQ+Tu2cqZG1yLyQ0KID+0aEOR2WM6SmGxEdw8/TBPLf+EBsO5LfrsdOPFJDYKxiALVm+TbusbfPb5ZN7mDuA4ooqVJXdOUUMS4zAZTNljDGd6M6Lh9M3KoQfL9/mc/mkOapKRnYBF49KJDosiC1ZvrUbzsz1zHEf0oI57uCXyT2QGvV8bNmVU2glGWNMp4sIDuSn80az42ghT7fTzdWsE6UUlleR2q8XY/tHsdnHkXtmXhGx4W6iw9wtOp/fJfdQ7zqqR06VkVtYbjdTjTGO+MLoRGaPTOAPK3eRfbK0zcfb7q23
p/btxfikaHblFJ7V3rwhmXkt6ylTy++Se21P99qPLDYN0hjjBBHhZ/NHU6PKfa9tb/Px0o8U4BIY2acX45KiqK5R0o80P3rPzCtu8UwZ8DG5i0i0iLwoIjtEJENEzqv3/pdEZIuIbBWRj0VkfIsj8apdR3WTd/krK8sYY5wyICaMb88exlvbc1iZntOmY6VnF5AcF06oO4DxA6IB2Hyo6eR+oriC/OKKDh25PwisUNWRwHggo977+4ALVHUs8AtgaYsj8Qr3Ltix+dBJIoID6RcV0tpDGWNMm319RgrDEiL46avbKam31kRLZBwpILVfFACJvUJI7BXc7E3V2pkyLWkYVqvZ5C4iUcBM4AkAVa1Q1TMiUtWPVbX2kas1QFKLI/GqLctsyy5gaEIEIjZTxhjjHHegi19eOYbDJ0v503/3tOoYJ0sqOHyylNS+vU5vG5cUzZZmulB+PlOmY0buyUAe8JSIbBSRx0WkqR8jtwD/aegNEVksIutFZH1eXl6D31xblqmoqrGbqcYYv3BOSixfnJzEYx/sZVeObw8f1VX7ZGpqv8+T+/ikKPbmFVNQ1ngTscy8ItwBLpJ6t7wrri/JPRCYBDyiqhOBYuDuhnYUkQvxJPe7GnpfVZeqapqqpsXHxzd4stp1VMHq7cYY//Gjy0cRERLI/1u+rcVtgdPrzJSpNS7JU3ff1sSUyMy8IpLjwlvVFdeX5J4FZKnqWu/rF/Ek+zOIyDjgcWCBqh5vcSRetWUZsORujPEfMeFufjRnJOv25/PihqwWfW/6kQLiI4OJjww+vW1ckqf+3tR8972tnCkDPiR3VT0KHBKREd5Ns4H0uvuIyEBgGfAVVd3Vqki8Qi25G2P81BcnDyBtUG9+/WYGJ4orfP6+9OyCM0btANFhbgbFhjV6U7WiqoYD+SUt7gZZy9fZMncA/xKRLcAE4NciskRElnjf/wkQCzwsIptEZH2rouHzkXtkSODpHgzGGOMPXC7hvgWjOVFSyfKNh336noqqGjLzis6ot9calxTdaI+Zg/nFVNdoq0fugc3vAqq6CUirt/nROu9/Hfh6qyKoJyTQk9yHJ0baTBljjN8Z3S+KoQkRrMzI4Wszkpvdf3duIZXVetbIHTw3VV/bnM2xonLiIs4czO5pw0wZ8MMnVF0uoVdIICP6WEnGGOOfLklNZO2+fE6VNL9c3umbqY2M3IEGSzO166amdJfkDrD0xjTunD3M6TCMMaZBl6QmUl2jrNqV2+y+6UcKCA0KYHDs2eWVMf174ZKGn1Tdm1dMYq9gIoJ9KrCcxS+T+7kpsST2sidTjTH+aUJSNHERwbztQ0uC9OwCRvaNbHA6Y5g7kGEJkY2O3FtbkgE/Te7GGOPPXC7h4lEJvL8zj4qqxvu9qyrpR86eKVPXuKQotmSdOmPuvKpacjfGGCdckppIUXkVa/Y2/lhP1olSCsuqGqy31xo3IJrjxZ72BLXyisopLKtq8QIddVlyN8aYVpg+NI7QoABWZjRemjnddqCJkft478NMdadEnu4pk2Ajd2OM6VQhQQGcPyyOlek5jbYjSM/+vId7Y0b26YU7wMXmOnX3z7tBWnI3xphOd3FqItmnyk6vslRf+pHPe7g3xh3oYlTfSLYcOnPkHhoUQN82TCyx5G6MMa00e2QCLoF3Gpk1k579eQ/3poxLimbb4VPU1Hg+AWTmFZESH46rFQ3DallyN8aYVoqNCGbyoN4NJvdTJZVn9XBvzNikKArLq9h7zFNrb+tMGbDkbowxbXLxqETSjxSQdaLkjO0N9XBvzPg6T6qWVVZz+GRpq1ZfqsuSuzHGtMElqYkAvJtx5tOqvsyUqTU0IYIwdwBbsk6x71gxqq3vKVPLkrsxxrRBSnwEQ+LDzyrNpGef3cO9MQEuYUy/KDZnnTzdU8aSuzHGOOzi1ETW7D1+xpJ5zT2ZWt+4pCjSswvYebQQEUiOs7KMMcY46tLURKpqlFU7PWtDV1TVsCe30Kd6
e61xA6Ipr6rhza1H6B8d2uT0SV9YcjfGmDaaMKA3seHu06WZpnq4N6b2SdXMvOI2PbxUy6fkLiLRIvKiiOwQkQwROa/e+yIiD4nIHhHZIiJnrbFqjDHdVYBLmD0qgVU7c6moqmmyh3tjBsaEER0WBNCmnjK1fB25PwisUNWRwHggo977c4Bh3q/FwCNtjswYY7qQS1L7UFhWxbp9+WQcKWy0h3tjRISx/T2j97beTAUfkruIRAEzgScAVLVCVes3H14A/F091gDRItK3zdEZY0wXMWNoHCFBLlZm5JB+5FSjPdybUjvfvVOSO5AM5AFPichGEXlcROr/OOoPHKrzOsu7zRhjeoRQdwAzhsbzTnoO6dkFjGpBvb3WZWP6MHFgNGP6t/x76/MluQcCk4BHVHUiUAzc3ZqTichiEVkvIuvz8vJacwhjjPFbl6YmcvhkKQVlVS26mVprTP8oln9rOpEhQW2OxZfkngVkqepa7+sX8ST7ug4DA+q8TvJuO4OqLlXVNFVNi4+Pb028xhjjty4cmYB4KzEtuZnaEZpN7qp6FDgkIiO8m2YD6fV2exW40Ttr5lzglKoead9QjTHGv8VHBjNpYG9EYGSfSEdj8XVZ7TuAf4mIG9gL3CwiSwBU9VHgTeByYA9QAtzcAbEaY4zfu/3Coaw/kE+Y29f02jGksRVEOlpaWpquX7/ekXMbY0xXJSIbVDWtuf3sCVVjjOmGLLkbY0w3ZMndGGO6IUvuxhjTDVlyN8aYbsiSuzHGdEOW3I0xphuy5G6MMd2QYw8xiUghsNORkzctDjjmdBANsLhaxuJqGX+NC/w3NqfiGqSqzTbncvL52J2+PGXV2URkvcXlO4urZSyulvPX2Pw1rlpWljHGmG7IkrsxxnRDTib3pQ6euykWV8tYXC1jcbWcv8bmr3EBDt5QNcYY03GsLGOMMd1QpyR3EXlSRHJFZFudbTEi8o6I7Pb+2rszYvEhrp+JyGER2eT9utyBuAaIyHsiki4i20XkTu92R69ZE3E5es1EJERE1onIZm9c93m3J4vIWhHZIyLPeReb8Ye4nhaRfXWu14TOjKtOfAHeRe9f97529Ho1EZfj10tE9ovIVu/513u3OZ7DmtJZI/engcvqbbsbeFdVhwHv0spFt9voac6OC+APqjrB+/VmJ8cEUAV8T1VTgXOB20QkFeevWWNxgbPXrBy4SFXHAxOAy7zLPf7OG9dQ4ARwi5/EBfCDOtdrUyfHVetOIKPOa6evV636cYF/XK8Lveevnf7o9P/HJnVKclfVD4D8epsXAH/z/v5vwJWdEUtdjcTlOFU9oqqfeX9fiOcfen8cvmZNxOUo9SjyvgzyfilwEZ4F3cGZ69VYXI4TkSTgCuBx72vB4evVUFx+zvEc1hQna+6JdRbRPgokOhhLfbeLyBZv2cbRj1oiMhiYCKzFj65ZvbjA4Wvm/Si/CcgF3gEygZOqWuXdJQsHfhDVj0tVa6/Xr7zX6w8iEtzZcQF/BH4I1Hhfx+IH16uBuGo5fb0UeFtENojIYu82v/n/2BC/uKGqnik7fjGiAR4BhuD5GH0E+D+nAhGRCOAl4DuqWlD3PSevWQNxOX7NVLVaVScAScBUYGRnx9CQ+nGJyBjgR3jimwLEAHd1ZkwiMhfIVdUNnXne5jQRl6PXy2uGqk4C5uApR86s+6af5TDA2eSeIyJ9Aby/5joYy2mqmuP9D1kDPIYnUXQ6EQnCk0D/parLvJsdv2YNxeUv18wby0ngPeA8IFpEaltsJAGH/SCuy7zlLVXVcuApOv96TQfmi8h+4N94yjEP4vz1OisuEfmnH1wvVPWw99dcYLk3Bsf/PzbFyeT+KnCT9/c3Aa84GMtptX9ZXguBbY3t24ExCPAEkKGqv6/zlqPXrLG4nL5mIhIvItHe34cCl+C5H/AecI13NyeuV0Nx7aiTEARPnbZTr5eq/khVk1R1MLAI+K+qfgmHr1cjcX3Z6eslIuEiEln7
e+BSbwx+mcNOU9UO/wKexfNxvRJPLe8WPDW+d4HdwEogpjNi8SGufwBbgS14/vL6OhDXDDwf8bYAm7xflzt9zZqIy9FrBowDNnrPvw34iXd7CrAO2AO8AAT7SVz/9V6vbcA/gYjO/jdWJ8ZZwOv+cL2aiMvR6+W9Lpu9X9uBH3u3O57DmvqyJ1SNMaYb8osbqsYYY9qXJXdjjOmGLLkbY0w3ZMndGGO6IUvuxhjTDVlyN8aYbsiSu+nWRGSWiEyr8/ppEbmmqe8xpjuw5G66u1nAtOZ28oV42P8Z0yXYP1Tj90RksIjs8I66d4nIv0TkYhH5yLtQwlTvwgkvezsHrhGRcd7OlUuA73oXWTjfe8iZIvKxiOytO4oXkR+IyKfeY9xX59w7ReTveJ6QHNBIjI+IyHqpsyiHd/vl3tg3iMhDdRagCPd20FwnnoUpFnTIxTM9lj2havyeN0nvwdNieDvwKZ5HwW8B5gM3A4eAY6p6n4hcBPxeVSeIyM+AIlV9wHusp4Fw4Do8nQZfVdWhInIpnr4qtwKCp43C/wIHgb3ANFVd00SMMaqaLyIBeB5J/zawC8+j6TNVdZ+IPAtEqupcEfk1kK6q//T2n1kHTFTV4na5aKbHC2x+F2P8wj5V3QogItvxrICjIrIVGAwMAq4GUNX/ikisiPRq5Fgvq6eDZbqI1PbgvtT7tdH7OgIYhie5H2gqsXtd6+3zHQj0BVLxfDLeq6r7vPs8C9T2Ar8UTwfE73tfhwADOXsFImNaxZK76SrK6/y+ps7rGjz/jitbeSyp8+tvVPWvdXf0fmpocjQtIsnA94EpqnrC++kgpJkYBLhaVXf6HrYxvrOau+kuVgNfAs8MGTwlmgKgEIj04fvfAr7mXYgEEekvIgk+nrsXnh8Ap7yfBOZ4t+8EUrw/IMBTCqp7vju8bWwRkYk+nssYn9jI3XQXPwOeFJEtQAmf99l+DXjRe8Pyjsa+WVXfFpFRwCfefFsEfBmobu7EqrpZRDYCO/DU/j/ybi8VkW8BK0SkGM+9glq/wLOk3BbvDJx9wFzf/7jGNM1uqBrTgUQkQlWLvCP0vwC7VfUPTsdluj8ryxjTsb4hngWytwNRwF+b2d+YdmEjd2NaQETWAsH1Nn+ldiaPMf7CkrsxxnRDVpYxxphuyJK7McZ0Q5bcjTGmG7Lkbowx3ZAld2OM6Yb+P9Lhva20tzM7AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.plot(x='mother_age', y='avg_wt');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The distribution (histogram) of mother's age in the raw data explains the erratic behavior at the edges of the plot: there is not enough data for mothers in their early teens or in their fifties. In statistical terms, these are outliers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZMAAAELCAYAAAAcKWtPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3Xl4VOXZ+PHvnckGJCEEwhYCCbuIAhJQQS24ILihrUVtrdSl1hb7Wltb9W19ra2+bd/u/mqxLghqXagrKq1SNxRECIsYCEvCmhBCEpZsZJvcvz/mBAfMRrYzy/25rrnmzDPPOefOgcydZ5nniKpijDHGtEeE2wEYY4wJfpZMjDHGtJslE2OMMe1mycQYY0y7WTIxxhjTbpZMjDHGtJslE2OMMe1mycQYY0y7WTIxxhjTbpFuB9BV+vTpo2lpaW6HYYwxQWXt2rXFqprcUr2wSSZpaWlkZma6HYYxxgQVEdndmnrWzWWMMabdLJkYY4xpN0smxhhj2i1sxkyMMYGvtraWvLw8qqqq3A4l7MTGxjJo0CCioqLatL8lE2NMwMjLyyM+Pp60tDRExO1wwoaqUlJSQl5eHunp6W06hnVzGWMCRlVVFb1797ZE0sVEhN69e7erRWjJxBgTUCyRuKO91926uUzIUVUqa7wcPlrLoYoajhyt5XBlLUeO1hITGUF8bCTxsVEkdIskITaK+NhI4mIiifTY31bGtJUlExPUar31fLb3MCtzS1iZW0xuUQVHKmup8daf9LGSekQzITWRjLQkMtJ6cVpKT2KjPJ0QtTGhx5KJCSreeiW7oJSVucWszC1h9c6DVNZ4EYFTByZw/qi+9OoRTa/uUSR2j6Jnt2gSu0fRq3s0Cd0iqamrp/RoHWVVtZRW1VFaVUtZle91/qGjrN1ziHe3HAAg2hPBaYN6kpHWi0lDkjh7WG96xNivjOk406ZN4/e//z0ZGRmtqr9w4UIyMzP561//+qX3pkyZwsqVKzs6xFaz3wwTFPIPH2Xhip38c20ehytrARjeN46rJw5iyrDenDW0N4ndozvkXCXl1azdfYi1uw+xZtdBFny8k79/uIP+CbE8MPtULj61f4ecx5iO5GYiAUsmJsBtzDvM4x/tZOnnBQDMHNufGWP6cfbQ3vRNiO2Uc/aOi2HGqf2Z4SSNqlova3Yd5KG3svnuM2uZeWp/Hph9Kv066fzG54E3NrF5X2mHHnPMwATuv/zUZuvs2rWLWbNmcc4557By5UpSUlJ4/fXXmTVr1rFWRHFxMRkZGezatYuFCxfy2muvUVFRwfbt27nrrruoqanhmWeeISYmhqVLl5KUlNTk+Z555hluueUW6urqWLBgAZMnT2b16tXccccdVFVV0a1bN5566ilGjRoFwN69e5k2bRr5+flcf/313H///QDExcVRXl4OwO9+9zsWL15MdXU1V111FQ888AAVFRXMmTOHvLw8vF4v9913H9dcc00HXdlWJBMRWQBcBhxQ1bFO2YvAKKdKInBYVceLSBqQDWx13lulqrc5+0wEFgLdgKXAHaqqIpIEvAikAbuAOap6SHxTC/4CXAJUAt9W1XXOseYCP3fO8aCqLmrjz28CkLdeeTe7kCc+2snqXQeJj4nkpqlpfHtqOimJ3bo8ntgoD+eOSOaNH/TmiY928uf/bGPFH4q5e9ZovjF5MBERNvso1Gzfvp3nn3+exx9/nDlz5vDyyy83Wz8rK4v169dTVVXF8OHD+e1vf8v69eu58847efrpp/nhD3/Y5L6VlZVs2LCB5cuXc9NNN5GVlcXo0aP56KOPiIyM5D//+Q///d//fSyG1atXk5WVRffu3Zk0aRKXXnrpcd1k77zzDtu3b2f16tWoKldccQXLly+nqKiIgQMH8tZbbwFw5MiRDrhSX2hNy2Qh8Ffg6YYCVT2WzkTkD4B/VLmqOr6R48wHvgN8ii+ZzAT+
BdwDvKuqvxGRe5zXdwOzgBHO40xn/zOd5HM/kAEosFZElqjqodb8wCZwVVTX8fK6PBZ8vJNdJZWkJHbj55eewjWTUomPbdu3cjtSlCeC700bxqyx/fnZa5/z89eyeG19Pr/+6mmM6Bfvdnghp6UWRGdKT09n/Hjfx9jEiRPZtWtXs/WnT59OfHw88fHx9OzZk8svvxyA0047jY0bNza773XXXQfAeeedR2lpKYcPH6asrIy5c+eyfft2RITa2tpj9S+66CJ69+4NwFe/+lU+/vjjLyWTd955hwkTJgBQXl7O9u3bOffcc/nxj3/M3XffzWWXXca55557chelBS0mE1Vd7rQ4vsRpPcwBzm/uGCIyAEhQ1VXO66eBK/Elk9nANKfqIuADfMlkNvC0qiqwSkQSneNMA5ap6kHnWMvwJabnW/pZTGDae7CSRSt38WLmXsqq6hg3qCf/77oJzBrbPyCn66b16cGzN5/Jy+vyefCtzVzy8Ed8b9pw/uv84QEZrzl5MTExx7Y9Hg9Hjx4lMjKS+nrfLMETv9znXz8iIuLY64iICOrq6po914nf7xAR7rvvPqZPn86rr77Krl27mDZtWrP1/akq9957L9/97ne/dK5169axdOlSfv7zn3PBBRfwP//zP83GdjLaO2ZyLlCoqtv9ytJFZD1QCvxcVT8CUoA8vzp5ThlAP1UtcLb3A/2c7RRgbyP7NFVugoiqsmrHQZ5asZP/ZBciIswa258bp6ZzxuDEgP/imohw9cRBTBuVzINvbubhd7eTd6iS3189zrq9QlRaWhpr165l8uTJvPTSSx123BdffJHp06fz8ccf07NnT3r27MmRI0dISfF9rC1cuPC4+suWLePgwYN069aN1157jQULFhz3/sUXX8x9993HN7/5TeLi4sjPzycqKoq6ujqSkpK4/vrrSUxM5IknnuiwnwHan0yu4/gWQQEwWFVLnDGS10Sk1W1VZwxF2xnTMSJyK3ArwODBgzvqsKYdjtZ4eeOzfTy1chfZBaX06h7FbV8ZxrfOHsKAnl0/HtJefeJi+PO1ExiaHMcfl20jITaK+y8fE/DJ0Jy8u+66izlz5vDYY49x6aWXdthxY2NjmTBhArW1tccSw09/+lPmzp3Lgw8++KVzTZ48ma997Wvk5eVx/fXXf2la8YwZM8jOzubss88GfAPzzz77LDk5OfzkJz8hIiKCqKgo5s+f32E/A4D4epFaqOTr5nqzYQDeKYsE8oGJqprXxH4fAHc59d5X1dFO+XXANFX9rohsdbYLnG6sD1R1lIj83dl+3tlnK74urmkN+zrlx9VrSkZGhtqdFt1R561nRW4Jr6/P5+1N+6mo8TK6fzw3Tk1j9viUkPhioKry0FvZPPHxTv7rghH86KKRbocUlLKzsznllFPcDiNsNXb9RWStqrb4RZj2tEwuBLb4JxIRSQYOqqpXRIbiGzzfoaoHRaRURM7CNwB/A/D/nN2WAHOB3zjPr/uV3y4iL+AbgD/iJJy3gf8VkV5OvRnAve34OUwnUFXW7z3Mkg37eHPjPorLa0iIjeTycQO5ckIKZ6YnhdRf7yLCzy49hdKqWh5+dzsJsZHccu5Qt8Mypsu0Zmrw8/haA31EJA+4X1WfBK7ly4Pe5wG/FJFaoB64rWGgHPg+X0wN/pfzAF8SWSwiNwO78Q3og2/G1yVADr6pwTcCOInpV8Aap94v/c5hXFbnrefRD3P559o8dpdUEh0ZwYWn9OWKcSlMH51MTGTwt0KaIiL8+qunU1ZVx4NvZZMQG8WcSaluh2UCwLx581ixYsVxZXfccQc33nijSxF1vFZ1c4UC6+bqGr98YzMLVuxkyrDeXDkhhZlj+5MQANN6u1J1nZdbFmWyIqeYR75xBrNOG+B2SEEjOzub0aNHh1SrNVioKlu2bGlzN5fNYzQd5rlP97BgxU5unJrGc985izkZqWGXSABiIj38/VsTmTC4F//1wnqWbytyO6SgERsb
S0lJCeHyR26gaLg5Vmxs21d1sJaJ6RArc4u54cnVTB3ehyfnZtj3LYAjlbVc89gn7C6p5NlbJjNxSNNLahgfu22ve5q6bW9rWyaWTEy77SquYPYjK0iOj+GV708Jy9ZIU4rKqvn6oyspr/by/l1fCYhv8htzMqyby3SJI0druXnRGiIEnpybYYnkBMnxMfzl2gkUl1fz1/dz3A7HmE5jycS0WZ23ntufW8fukkrmXz+RIb17uB1SQBqXmsjXzhjEUx/vYndJhdvhGNMpLJmYNnvwrWw+2l7MQ1eN5ayhvd0OJ6D9dOYoIj3CQ29lux2KMZ3Ckolpk2dX7Wbhyl3cck4610yypWpa0i8hlnnTh/PO5kJW5hS7HY4xHc6SiTlpK3OKuX/JJqaPSubeS2zpi9a6+Zx0BvXqxi/f3ExdG+5Rb0wgs2RiTsrO4gq+9491DEvuwcPXTcBjK+S2WmyUh/++5BS27C/jhTV7W97BmCBiycS0mv/MrSdumGTTXNtg1tj+TE5P4o/LtnHkaG3LOxgTJCyZmFZpmLm192Alj14/kcG9u7sdUlASEf7nsjEcqqzh4Xe3t7yDMUHCkolplYaZWw9eOZYzbeZWu4xN6ck1GaksWrmL3KJyt8MxpkNYMjEt+senNnOro/14xihiozw2VdiEDEsmplkrc4u5/3WbudXRkuNj+MH5w3lvywE+tIUgTQiwZGKatKu4gu89u470PjZzqzN8e2oaQ3p351dvbqbWpgqbIGfJxDTq+DW3bOZWZ4iJ9PCzS04h50C5TRU2Qc+SifmSOm89P3h+/bE1t2zmVue5aEw/Jg7pxaMf5FrrxAQ1SybmS15el8fybUX86kpbc6uziQi3Tx9O/uGjvLY+3+1wjGmzFpOJiCwQkQMikuVX9gsRyReRDc7jEr/37hWRHBHZKiIX+5XPdMpyROQev/J0EfnUKX9RRKKd8hjndY7zflpL5zAd44U1exnRN45r7f7lXWLaqGTGDEhg/ge5eOvD4/5CJvS0pmWyEJjZSPmfVHW881gKICJjgGuBU519/iYiHhHxAI8As4AxwHVOXYDfOscaDhwCbnbKbwYOOeV/cuo1eY6T+7FNU7YXlrF+z2HmZKTafbi7iIgwb/pwdhRX8O+s/W6HY0ybtJhMVHU5cLCVx5sNvKCq1aq6E8gBJjuPHFXdoao1wAvAbPF9Wp0PvOTsvwi40u9Yi5ztl4ALnPpNncN0gMWZe4mMEK46I8XtUMLKzLH9GZrcg7++n2P3PzdBqT1jJreLyEanG6yXU5YC+E9LyXPKmirvDRxW1boTyo87lvP+Ead+U8cy7VTrreeVdflccEpf+sTFuB1OWPFECN/7yjCyC0r5YKt978QEn7Ymk/nAMGA8UAD8ocMi6kAicquIZIpIZlGR/YK25N3sA5RU1HCNjZW44soJKaQkdrPWiQlKbUomqlqoql5VrQce54tupnzA/5NokFPWVHkJkCgikSeUH3cs5/2eTv2mjtVYnI+paoaqZiQnJ7flRw0rizP30jc+hvNG2LVyQ5Qngtu+MpS1uw/x6c7W9iwbExjalExEZIDfy6uAhpleS4BrnZlY6cAIYDWwBhjhzNyKxjeAvkR9f369D1zt7D8XeN3vWHOd7auB95z6TZ3DtENhaRUfbD3A1RMHEemxGeNu+XpGKn3iYnjk/Ry3QzHmpES2VEFEngemAX1EJA+4H5gmIuMBBXYB3wVQ1U0ishjYDNQB81TV6xznduBtwAMsUNVNzinuBl4QkQeB9cCTTvmTwDMikoNvAsC1LZ3DtN1La/OoV9+HmXFPbJSH75ybzq//tYXP9h5mXGqi2yEZ0yoSLn2zGRkZmpmZ6XYYAUlVmf77D+ibEMvi757tdjhhr7y6jqm/eY8z05N47IYMt8MxYU5E1qpqi/8RrT/DsHrnQXaVVHKNtUoCQlxMJN+eksY7mwvZur/M7XCMaRVLJobFmXnExUQy67T+bodiHDdOTaN7
tIf5H9jYiQkOlkzCXFlVLUs/L+DycQPpHt3iEJrpIondo7n+rCEs+Wwfu0sq3A7HmBZZMglzb24s4GitlzkZg9wOxZzglnPSifRE8OiHO9wOxZgWWTIJcy+u2cvIfnGMt1lDAadvQixzMgbx0tq9FBw56nY4xjTLkkkY21ZYxoa9tqhjILvtK8NQhceWW+vEBDZLJmFs8RpnUccJtrRZoBrUqztXTUjhuU/3UFRW7XY4xjTJkkmYqqmr59X1+Vx4Sj9626KOAe3704dT663niY+tdWIClyWTMPXelkJb1DFIpPfpweXjBvLsJ7s5VFHjdjjGNMqSSZhanJlHv4QYzh3Rx+1QTCvMmz6cihovT63Y6XYoxjTKkkkYqqmr56PtRVx++kBb1DFIjOwXz8xT+/PUyl2UVtW6HY4xX2KfJGEot6icWq9yuk0HDiq3nz+csqo6nvlkt9uhGPMllkzC0Jb9pQCc0j/e5UjMyRib0pPzR/fliY92UFFd1/IOxnQhSyZhKLugjOjICNL79HA7FHOS5k0fzqHKWp77dI/boRhzHEsmYSi7oJSR/eJsvCQITRzSi6nDe/PYRzuoqrXb+JjAYZ8mYSi7oIxT+ie4HYZpo9unj6CorJrFmXvdDsWYYyyZhJmismqKy6sZPcCSSbA6a2gSk9J68egHudTU1bsdjjGAJZOw03CzJRt8D14iwu3nj2DfkSpeWZfndjjGAK1IJiKyQEQOiEiWX9nvRGSLiGwUkVdFJNEpTxORoyKywXk86rfPRBH5XERyRORhcVYWFJEkEVkmItud515OuTj1cpzznOF3rLlO/e0iMrcjL0ioyy7wzeSylklwO29EH04f1JO/fZBLnddaJ8Z9rWmZLARmnlC2DBirqqcD24B7/d7LVdXxzuM2v/L5wHeAEc6j4Zj3AO+q6gjgXec1wCy/urc6+yMiScD9wJnAZOD+hgRkWpa9v5R+CTEk9Yh2OxTTDiLC7dOHs+dgJUs+2+d2OMa0nExUdTlw8ISyd1S1YaL7KqDZOyuJyAAgQVVXqaoCTwNXOm/PBhY524tOKH9afVYBic5xLgaWqepBVT2EL7GdmOxME7ILyjjFWiUh4aIx/RjdP56/f7gD36+VMe7piDGTm4B/+b1OF5H1IvKhiJzrlKUA/p27eU4ZQD9VLXC29wP9/PbZ28g+TZWbFtR668k5UMZom8kVEkSEm85JZ2thGStzS9wOx4S5diUTEfkZUAf8wykqAAar6gTgR8BzItLqTy6n1dJhf2KJyK0ikikimUVFRR112KC1o6iCWq9yygAbfA8VV4wbSO8e0Sz42BaANO5qczIRkW8DlwHfdJIAqlqtqiXO9logFxgJ5HN8V9ggpwyg0Om+augOO+CU5wOpjezTVPmXqOpjqpqhqhnJyclt/ElDR8Pgu3VzhY7YKA/fPGsI7209wM7iCrfDMWGsTclERGYCPwWuUNVKv/JkEfE420PxDZ7vcLqxSkXkLGcW1w3A685uS4CGGVlzTyi/wZnVdRZwxDnO28AMEenlDLzPcMpMC7L3lxLtsWVUQs31Zw0mMkJYtHKX26GYMNaaqcHPA58Ao0QkT0RuBv4KxAPLTpgCfB6wUUQ2AC8Bt6lqw+D994EngBx8LZaGcZbfABeJyHbgQuc1wFJgh1P/cWd/nOP9CljjPH7pdw7TjOyCMkb0iyPKllEJKX3jY7n89IH8M3OvLU9vXBPZUgVVva6R4iebqPsy8HIT72UCYxspLwEuaKRcgXlNHGsBsKDpqE1jthSUcu4I6+4LRTdOTeeV9fksXrOXW84d6nY4JgzZn6hhoqS8mgNl1Tb4HqJOG9STSWm9WLhyF956myZsup4lkzCxpWEZFRt8D1k3TU0n79BRlm0udDsUE4YsmYSJY8uo2JpcIeuiMf1ISexm94k3rrBkEiayC8pIjo+hd1yM26GYThLpiWDulCF8uvMgWflH3A7HhBlLJmFiy/5S6+IKA9dkDKZ7
tIenVuxyOxQTZiyZhIE6bz3bC8tt2fkw0LN7FFdPHMQbn+2jqKza7XBMGLFkEgZ2FFdQ4623lkmY+PaUNGq89fzj091uh2LCiCWTMPDFPUysZRIOhibHMX1UMs+u2k11nd0n3nQNSyZhILugjCiPMLRPnNuhmC5y0znpFJfX8OZnBS1XNqYDWDIJA1v2lzK8bzzRkfbPHS7OGd6HEX3jWLBip93rxHQJ+3QJA1sKymzwPcyICDdOTWfTvlLW7DrkdjgmDFgyCXGHKmrYX1plg+9h6KoJKcTHRvKcDcSbLmDJJMRl77fB93DVLdrDVRNSWJq1n0MVNW6HY0KcJZMQl13gW5PLbtUbnq6dNJiaunpeWd/o/eOM6TCWTELcloJS+sTFkBxvy6iEozEDExifmsjzq/fYQLzpVJZMQtyW/WW27HyY+8aZg8k5UE7mbhuIN53HkkkIq/PWs7WwzAbfw9xlpw8gPiaS5z/d43YoJoRZMglhu0oqqKmrt2Xnw1z36EiunJDCm58XcLjSBuJN57BkEsI22+C7cVw32TcQ/6oNxJtO0qpkIiILROSAiGT5lSWJyDIR2e4893LKRUQeFpEcEdkoImf47TPXqb9dROb6lU8Ukc+dfR4WEWnrOcwXthSUEhkhDO9ry6iEuzEDExhnA/GmE7W2ZbIQmHlC2T3Au6o6AnjXeQ0wCxjhPG4F5oMvMQD3A2cCk4H7G5KDU+c7fvvNbMs5zPG27C9jeN84W0bFAPCNyalsKyxn3R4biDcdr1WfMqq6HDh4QvFsYJGzvQi40q/8afVZBSSKyADgYmCZqh5U1UPAMmCm816Cqq5S359MT59wrJM5h/GTXWA3xDJfuOz0gcTFRPIPG4g3naA9f7L2U9WGJUn3A/2c7RRgr1+9PKesufK8Rsrbco7jiMitIpIpIplFRUUn8aMFv8OVNRQcqbLBd3NMj5hIZo8fyFsbCzhSWet2OCbEdEj/h9Oi6NSO2LacQ1UfU9UMVc1ITk7upMgC07FvvlvLxPi5bvJgquvqeXV9XsuVjTkJ7UkmhQ1dS87zAac8H0j1qzfIKWuufFAj5W05h3Fscdbksi8sGn9jU3py+qCePL96rw3Emw7VnmSyBGiYkTUXeN2v/AZnxtVZwBGnq+ptYIaI9HIG3mcAbzvvlYrIWc4srhtOONbJnMM4Nu8rpXePaJLjbBkVc7zrJg9ma2EZ6/YcdjsUE0JaOzX4eeATYJSI5InIzcBvgItEZDtwofMaYCmwA8gBHge+D6CqB4FfAWucxy+dMpw6Tzj75AL/cspP6hzmC5v2lXJqSk+cWdbGHHPFuIH0iPbw/GobiDcdJ7I1lVT1uibeuqCRugrMa+I4C4AFjZRnAmMbKS852XMYqK7zsq2wjGmjwmucyLROj5hIZk9I4ZV1edx32Rh6dotyOyQTAuwLCCFo2/5y6uqVsSk93Q7FBKhvTB5MVW09r2+woUbTMSyZhKCsfUcAGDvQkolp3NiUnpyW0pPnPrVvxJuOYckkBGXlHyE+NpLUpG5uh2IC2LWTU9myv4yNeUfcDsWEAEsmIShrXyljB9rgu2neFeMG0i3Kwwtr9rZc2ZgWWDIJMXXeerYUlHLqQPuyomlefGwUl5w2gDc+20dlTZ3b4ZggZ8kkxOQWVVBdV2+D76ZVrp2cSnl1HW9ttK9pmfaxZBJisvKdwfcUa5mYlmUM6cXQ5B68aF1dpp0smYSYrH1H6BblIb2P3cPEtExEuCYjlczdh8g5UOZ2OCaIWTIJMZvySxkzMAFPhA2+m9b56hmDiIwQFmfa4o+m7SyZhJD6emXTviM2+G5OSnJ8DBee0o+X1+ZRU1fvdjgmSFkyCSG7D1ZSUeO1Lyuak3bN5FRKKmp4N7vQ7VBMkLJkEkIaBt9PtcF3c5LOG5HMgJ6xvJhpA/GmbSyZhJCsfUeI9kQwoq/dw8ScHE+E8PWJg/hwWxH7Dh91OxwThCyZhJBN+aWM6h9PdKT9
s5qT9/UM3/3m/mkD8aYN7FMnRKgqWfuO2PdLTJulJnVn6rA+LM7cS329Lf5oTo4lkxCRf/gohytrGWOD76YdrpmUSv7ho6zILXY7FBNkLJmEiE37fPd8H2vTgk07zDi1H4ndo2zxR3PSLJmEiE35R/BECKcMsGRi2i4m0sNVE1JYtqmQgxU1bodjgkibk4mIjBKRDX6PUhH5oYj8QkTy/cov8dvnXhHJEZGtInKxX/lMpyxHRO7xK08XkU+d8hdFJNopj3Fe5zjvp7X15wgVWftKGZ4cR2yUx+1QTJC7ZlIqNd56Xl1vd2E0rdfmZKKqW1V1vKqOByYClcCrztt/anhPVZcCiMgY4FrgVGAm8DcR8YiIB3gEmAWMAa5z6gL81jnWcOAQcLNTfjNwyCn/k1MvrGXlH7Hvl5gOMbp/AuNTE3lxjd2F0bReR3VzXQDkquruZurMBl5Q1WpV3QnkAJOdR46q7lDVGuAFYLb47ux0PvCSs/8i4Eq/Yy1ytl8CLpAwvhPUgdIqDpRVc6oNvpsOcs2kVLYVlrNh72G3QzFBoqOSybXA836vbxeRjSKyQER6OWUpgP+oXp5T1lR5b+CwqtadUH7csZz3jzj1w5INvpuOdvm4gXSP9vDsqj1uh2KCRLuTiTOOcQXwT6doPjAMGA8UAH9o7znaSkRuFZFMEcksKipyK4xOt2mfbxmVMZZMTAeJi4nka2cM4o3P9lFcXu12OCYIdETLZBawTlULAVS1UFW9qloPPI6vGwsgH0j122+QU9ZUeQmQKCKRJ5Qfdyzn/Z5O/eOo6mOqmqGqGcnJye3+QQNVVn4p6X16EB8b5XYoJoTMnTKEGm89L6y21olpWUckk+vw6+ISkQF+710FZDnbS4BrnZlY6cAIYDWwBhjhzNyKxtdltkR9I3/vA1c7+88FXvc71lxn+2rgPQ3jkcIsW3bedILhfeM5d0Qfnlm1m1qvLU1vmteuZCIiPYCLgFf8iv9PRD4XkY3AdOBOAFXdBCwGNgP/BuY5LZg64HbgbSAbWOzUBbgb+JGI5OAbE3nSKX8S6O2U/wg4Np043ByurCHv0FEbfDed4sapaRSWVvPvrP1uh2ICXGTLVZqmqhWcMPCtqt9qpv5DwEONlC9WYSUQAAAYnUlEQVQFljZSvoMvusn8y6uAr7ch5JBzbPDdpgWbTjBtZF+G9O7OwpW7uHzcQLfDMQHMvgEf5I7dw8RaJqYTREQIc89OY+3uQ2zMs2nCpmmWTILcpn2lpCR2I6lHtNuhmBB1dcYgekR7WLhyl9uhmABmySTI2eC76WwJsVFcPXEQb35WQFGZTRM2jbNkEsTKq+vYWVzB2BTr4jKd64YpadR463nepgmbJlgyCWLZBaWoYi0T0+mGJcfxlZHJPLtqNzV1Nk3YfJklkyDWMPhuLRPTFW6cmsaBsmr+lVXgdigmAFkyCWJZ+aX0iYuhb3yM26GYMHDeiGSG9ulhA/GmUZZMgtgm557vYbxgsulCERHC3ClprN9z2FYTNl9iySRIVdV62X6gnLH2/RLThb42cRBxMZEsXLHT7VBMgLFkEqSy8o/grVcbLzFdKi4mkq9nDOKtzws4UFrldjgmgFgyCVKf5PoWST4zPcnlSEy4mXt2GnX1yj8+tWnC5guWTILUytwSxgxIoJd98910sbQ+PZg+qi//+HQP1XVet8MxAcKSSRCqqvWyds8hpgwL25tLGpfdfE46xeXVLF6zt+XKJixYMglC6/YcoqaunrMtmRiXTBnWm4whvXjk/Vyqaq11YiyZBKVPckvwRAiTbbzEuEREuPOikewvreJFa50YLJkEpZW5JZyW0tNu02tcNWVYbyanJfG3D3KsdWIsmQSb8uo6Ptt72MZLjOsaWieFpdU8ZzO7wp4lkyCzZtdB6uqVKcP6uB2KMZw9rDdnDU1i/oc2dhLu2p1MRGSXc8/3DSKS6ZQlicgyEdnuPPdyykVEHhaRHBHZKCJn+B1nrlN/
u4jM9Suf6Bw/x9lXmjtHqPskt4RoTwQTh4TFj2uCwJ0XjqSorJpnV+12OxTjoo5qmUxX1fGqmuG8vgd4V1VHAO86rwFmASOcx63AfPAlBuB+4Ex893y/3y85zAe+47ffzBbOEdI+yS1h/OBEukV73A7FGADOHNqbKcN68+iHuVTW1LkdjnFJZ3VzzQYWOduLgCv9yp9Wn1VAoogMAC4GlqnqQVU9BCwDZjrvJajqKlVV4OkTjtXYOULWkcpasvYdsfESE3DuvGgkxeU11joJYx2RTBR4R0TWisitTlk/VW246cF+oJ+znQL4zyPMc8qaK89rpLy5c4SsVTtLUMXGS0zAmZSWxLkj+vD3D3dY6yRMdUQyOUdVz8DXhTVPRM7zf9NpUWgHnKdJTZ1DRG4VkUwRySwqKurMELrEJ7klxEZFMD410e1QjPmSH144kpKKGp7+xFon4ajdyURV853nA8Cr+MY8Cp0uKpznA071fCDVb/dBTllz5YMaKaeZc/jH9piqZqhqRnJycnt+zICwMreYSWlJREfaJDwTeCYO6cV5I5P5+4e5lFdb6yTctOtTSUR6iEh8wzYwA8gClgANM7LmAq8720uAG5xZXWcBR5yuqreBGSLSyxl4nwG87bxXKiJnObO4bjjhWI2dIyQVlVWzrbDcurhMQLvzwhEcqqxlkd2NMexEtnP/fsCrzmzdSOA5Vf23iKwBFovIzcBuYI5TfylwCZADVAI3AqjqQRH5FbDGqfdLVT3obH8fWAh0A/7lPAB+08Q5QtKqHb4l5209LhPIJgzuxfRRyTz+0Q5uOHuIrdIQRtqVTFR1BzCukfIS4IJGyhWY18SxFgALGinPBMa29hyhamVuCfExkYwdmOB2KMY064cXjmT2IytYuGIXP7hghNvhmC5ine9B4pPcYs4cmkSkx/7JTGAbl5rIhaf047HlOygpr3Y7HNNF7JMpCOQfPsqukkrOtvESEyTumTWaylovf/7PdrdDMV3EkkkQaLhFr31Z0QSL4X3juP7MwTy3eg/bC8vcDsd0AUsmQWBlbjFJPaIZ1S/e7VCMabU7LhxJ92gP/7s02+1QTBewZBLgVJVVuSWcNTSJiAhxOxxjWi2pRzQ/OH84728tYvm24P/SsGmeJZMAt7ukkn1Hqmy8xASluVPSSE3qxkNvZeOt79SFMIzLLJkEuJU2XmKCWEykh3tnncLWwjIWZ9rtfUOZJZMAtzK3mH4JMQzt08PtUIxpk1lj+5MxpBd/eGerLbMSwiyZBDBV5ZPcEqYM64OzyoAxQUdE+PllYygur2H+Bzluh2M6iSWTALatsJySihpbQsUEvfGpiVw5fiCPf7STvEOVbodjOoElkwD2SW4xAGcPtWRigt9PZo5GgN+9vdXtUEwnsGQSwFbklpCa1I3UpO5uh2JMu6UkduM75w7l9Q37WL/nkNvhmA5mySRAlVbV8uG2Is4f1dftUIzpMLdNG0afuBgefCsb37qvJlRYMglQ//q8gJq6eq46Y1DLlY0JEnExkdw1YyRrdx/itQ35Le9ggoYlkwD1yrp8hvbpwbhBPd0OxZgO9fWMVCYMTuQXSzZzoLTK7XBMB7FkEoDyDx/l050HuXJCik0JNiHHEyH8/uvjqKr18t+vZll3V4iwZBKAXlvva/5fNSHF5UiM6RzDkuO4a8Yo/pNdaN1dIcKSSYBRVV5dn8+ktF42i8uEtJvOSWfikF7c//omCq27K+i1OZmISKqIvC8im0Vkk4jc4ZT/QkTyRWSD87jEb597RSRHRLaKyMV+5TOdshwRucevPF1EPnXKXxSRaKc8xnmd47yf1tafI9Bs2ldKzoFyrppgA+8mtHkihN9dfTo13nrufeVz6+4Kcu1pmdQBP1bVMcBZwDwRGeO89ydVHe88lgI4710LnArMBP4mIh4R8QCPALOAMcB1fsf5rXOs4cAh4Gan/GbgkFP+J6deSHhlXT7RngguPW2A26EY0+mGJsfxk4tH
896WA7y8zrq7glmbk4mqFqjqOme7DMgGmuvknw28oKrVqroTyAEmO48cVd2hqjXAC8Bs8Y08nw+85Oy/CLjS71iLnO2XgAskBEaq67z1LPlsH+eP7kvP7lFuh2NMl7hxShqT05J44I1N7D9i3V3BqkPGTJxupgnAp07R7SKyUUQWiEgvpywF8F+DOs8pa6q8N3BYVetOKD/uWM77R5z6Qe3jnGKKy6u56gwbeDfhIyJC+L+rT6fOq9zzykbr7gpS7U4mIhIHvAz8UFVLgfnAMGA8UAD8ob3naEdst4pIpohkFhUF/p3eXl2fT89uUUwblex2KMZ0qbQ+Pbh75ig+2FrEPzPz3A7HtEG7komIROFLJP9Q1VcAVLVQVb2qWg88jq8bCyAfSPXbfZBT1lR5CZAoIpEnlB93LOf9nk7946jqY6qaoaoZycmB/QFdXl3H25v2c9npA4iJ9LgdjjFd7oaz0zgzPYlfvbmZfYePuh2OOUntmc0lwJNAtqr+0a/cf+T4KiDL2V4CXOvMxEoHRgCrgTXACGfmVjS+Qfol6mvrvg9c7ew/F3jd71hzne2rgfc0yNvGb2ftp6q2nq9aF5cJUxERwu+uHodXlZ++tNFu8xtk2tMymQp8Czj/hGnA/ycin4vIRmA6cCeAqm4CFgObgX8D85wWTB1wO/A2vkH8xU5dgLuBH4lIDr4xkSed8ieB3k75j4Bj04mD1avr8xmc1J0zBvdqubIxIWpw7+7cf/kYPs4p5oE3Ntn4SRCJbLlK41T1Y6CxGVRLm9nnIeChRsqXNrafqu7gi24y//Iq4OsnE28gKyytYkVuMT84f4Qtn2LC3jWTBrOjqIK/L99BSmI3vvuVYW6HZFqhzcnEdJzXN+SjasunGNPg7pmjyT98lF//awsDErtxxbiBbodkWmDJJAC8si6f8amJpPfp4XYoxgSECGcxyANl1dy1+DP6xsdwlt1xNKDZ2lwuyy4oZcv+Mht4N+YEsVEeHvvWRFKTunHr05nkHChzOyTTDEsmLnttfT6REcJlp1sz3pgTJXaPZuGNk4mJ8jB3wRq7/0kAs2TiIm+98tqGfKaN6ktSj2i3wzEmIKUmdWfB3EkcqqzhpkVrqKiua3kn0+Usmbho+fYiCkurbeDdmBacNqgnj3zjDLILypj33DrqvPVuh2ROYMnEJRXVddz/+iYGJ3XnglP6uh2OMQFv+ui+PHjlWD7YWsS859ZZCyXAWDJxya//lc3eQ5X8/uvjiI2y5VOMaY3rJg/mvsvGsGxzIV+bv5K9ByvdDsk4LJm4YPm2Ip5dtYdbzklncnqS2+EYE1RuPiedhTdOZt/ho1z+149ZkVPsdkgGSyZd7sjRWn760kaG943jxzNGuR2OMUHpvJHJLLn9HJLjYrhhwWoWfLzTll5xmSWTLvbAG5soKq/mj3Ose8uY9kjr04NX503lgtF9+eWbm7nrnxupqvW6HVbYsmTShd7etJ9X1uUzb/pwTh+U6HY4xgS9uJhIHr1+IndcMIKX1+VxzWOrKLTvorjCkkkXKSmv5mevfs6pAxO4ffpwt8MxJmRERAh3XjSSR6+fyPbCMi59+GNeWptnS9h3MUsmXUBV+dmrWZQereMPc8YRHWmX3ZiONnNsf179/lQGJsZy1z8/45K/fMS72YU2ltJF7FOtCyz5bB//3rSfOy8ayej+CW6HY0zIGtU/ntfnTeWRb5xBjbeemxdlMufvn7B290G3Qwt5lkw6WWFpFfe9lsUZgxO59byhbodjTMgTES49fQDv3HkeD101ll0llXxt/ifcsiiTbYW2WGRnsSXoO1FRWTV3vriBGm89f5gzHk+E3fjKmK4S5Yngm2cO4aoJKTy1YhePfpDLzD8vZ+bY/swaO4Bpo5KJj41yO8yQYcmkE5RW1fL48h08+fFOquvq+fVVp9m9SoxxSffoSOZNH851kwcz/4McXlmXz9LP9xPlEaYM68PFp/bnwjF96Rsf63aoQU2C
eXBKRGYCfwE8wBOq+pum6mZkZGhmZmanxlNV6+XZVbt55P0cDlXWcvm4gfzoopGWSIwJIN56Zd2eQ7yzaT9vbypkz8FKRGBCaiIzTu3PhNRERvdPoGd3a7UAiMhaVc1osV6wJhMR8QDbgIuAPGANcJ2qbm6sfmcmkzpvPa+sz+fPy7ax70gV541M5qcXj2JsSs9OOZ8xpmOoKlsLy3hnUyHvbN5PVn7psff6JcQwqn8Co/rFMbJfPKP7JzA0uQc9YsKrQ6e1ySSYr8pkIEdVdwCIyAvAbKDRZNJeqkpFjZeismqKy6spdp6LyqpZmrWfnAPljEtN5PdzxjFlWJ/OCMEY08FEhNH9ExjdP4H/umAEB0qr2FxQytb9ZWwtLGPr/jIW7Sihpu6LJe+7RXlI6hFNn7hoknpE0zsuht49fNvxsVF0j/bQLdpDtygP3aM9xDrP3aI9REZEEOURojwRRHqEqIgIIpoYS1VVvPWKVxVV3xhQII+7BnMySQH2+r3OA85sqvKOogque2wVAMrxrTFV36Ouvh5vvVJXr8c919TVU1JRTVXtl++hIAKj+sXz6PVncPGp/REJ3H9sY0zz+ibE0jchlmmjvrgthLde2V1Swdb9ZewsqeBgeQ0HK2ooqaihqLyarfvLKK6oOS7hnIwIgUhPBBEC9fXgVaXeSSAnivIIsZEeYqI8dIuOIDbSl6yiIyNo+ORp+AgSjm3QFZ9KwZxMWiQitwK3AvQYMOz4b8SecHU9EUJMVCSeCCEyQnzPnggiI4TIiAiSekSRHB9Dnzi/R3w0Sd2jifTYDGtjQpUnQhiaHMfQ5Lgm6zT0XJRX1VFZU8fRWi9Ha7xU1niPbR+t9VLnrafGq9R566mrV2q99dR666nz+hJIRITgEd/nT4Tz7IkQRKC2Tqmq8x2rus5LVW09VbVeqmq91Dg3C2tIQMeeaTwptdbJ7BrMySQfSPV7PcgpO0ZVHwMeA9+YyeLbzu666IwxYUNEiIuJJC4Ex1PkttbVC+Y/qdcAI0QkXUSigWuBJS7HZIwxYSlo06iq1onI7cDb+KYGL1DVTS6HZYwxYSlokwmAqi4FlrodhzHGhLtg7uYyxhgTICyZGGOMaTdLJsYYY9rNkokxxph2s2RijDGm3YJ2oceTJSJlwFa342hEH6DY7SAaYXGdvECNzeI6ORbX8YaoanJLlYJ6avBJ2tqalS+7mohkWlytF6hxQeDGZnGdHIurbaybyxhjTLtZMjHGGNNu4ZRMHnM7gCZYXCcnUOOCwI3N4jo5FlcbhM0AvDHGmM4TTi0TY4wxnSQkk4mILBCRAyKS5VeWJCLLRGS789wrQOL6hYjki8gG53GJC3Glisj7IrJZRDaJyB1OuavXrJm4XL1mIhIrIqtF5DMnrgec8nQR+VREckTkRefWCIEQ10IR2el3vcZ3ZVx+8XlEZL2IvOm8dvV6NROX69dLRHaJyOfO+TOdMtc/w5oTkskEWAjMPKHsHuBdVR0BvOu87moL+XJcAH9S1fHOw41VkOuAH6vqGOAsYJ6IjMH9a9ZUXODuNasGzlfVccB4YKaInAX81olrOHAIuDlA4gL4id/12tDFcTW4A8j2e+329WpwYlwQGNdrunP+hunAbv8+Niskk4mqLgcOnlA8G1jkbC8CruzSoGgyLtepaoGqrnO2y/D9YqXg8jVrJi5XqU+58zLKeShwPvCSU+7G9WoqLteJyCDgUuAJ57Xg8vVqLK4A5/pnWHNCMpk0oZ+qFjjb+4F+bgZzgttFZKPTDeZq01VE0oAJwKcE0DU7IS5w+Zo5XSMbgAPAMiAXOKyqdU6VPFxIfCfGpaoN1+sh53r9SURiujou4M/AT4F653VvAuB6NRJXA7evlwLviMhaEbnVKQuY38fGhFMyOUZ9U9gC4i82YD4wDF+3RAHwB7cCEZE44GXgh6pa6v+em9eskbhcv2aq6lXV8cAgYDIwuqtjaMyJcYnIWOBefPFNApKAu7sy
JhG5DDigqmu78rwtaSYuV6+X4xxVPQOYha979zz/NwPsMwwIr2RSKCIDAJznAy7HA4CqFjofAPXA4/g+mLqciETh+8D+h6q+4hS7fs0aiytQrpkTy2HgfeBsIFFEGpYoGgTkB0BcM53uQlXVauApuv56TQWuEJFdwAv4urf+gvvX60txicizAXC9UNV85/kA8KoTg+u/j80Jp2SyBJjrbM8FXncxlmMa/nM4rgKymqrbiTEI8CSQrap/9HvL1WvWVFxuXzMRSRaRRGe7G3ARvvGc94GrnWpuXK/G4tri9wEk+PrZu/R6qeq9qjpIVdOAa4H3VPWbuHy9mojrerevl4j0EJH4hm1ghhNDQH6GHaOqIfcAnsfX/VGLry/2Znx9tO8C24H/AEkBEtczwOfARnz/WQa4ENc5+JrMG4ENzuMSt69ZM3G5es2A04H1zvmzgP9xyocCq4Ec4J9ATIDE9Z5zvbKAZ4G4rv4/5hfjNODNQLhezcTl6vVyrstnzmMT8DOn3PXPsOYe9g14Y4wx7RZO3VzGGGM6iSUTY4wx7WbJxBhjTLtZMjHGGNNulkyMMca0myUTY4wx7WbJxJgOJCLTRGSK3+uFInJ1c/sYEwosmRjTsaYBU1qq1BriY7+jJijYf1RjTiAiaSKyxWlVbBORf4jIhSKywrkx0WTnRkWvOSvLrhKR052VjW8D7nRuanSuc8jzRGSliOzwb6WIyE9EZI1zjAf8zr1VRJ7G9w3s1CZinC8imeJ3Eyyn/BIn9rUi8rDfDZ96OCssrxbfjaBmd8rFM2HLvgFvzAmcpJCDb8n7TcAafEtb3AxcAdwI7AWKVfUBETkf+KOqjheRXwDlqvp751gLgR7ANfhWol2iqsNFZAa+dam+Cwi+ZWH+D9gD7ACmqOqqZmJMUtWDIuLBt8TGfwHb8C21cZ6q7hSR54F4Vb1MRP4X2Kyqzzrrd60GJqhqRYdcNBP2IluuYkxY2qmqnwOIyCZ8d7hTEfkcSAOGAF8DUNX3RKS3iCQ0cazX1LfC8WYRabgHxQznsd55HQeMwJdMdjeXSBxznPtcRAIDgDH4ehp2qOpOp87zQMO9MGbgWyH3Lud1LDCYL99h0Jg2sWRiTOOq/bbr/V7X4/u9qW3jscTv+deq+nf/ik6rqNnWgoikA3cBk1T1kNP6iW0hBgG+pqpbWx+2Ma1nYybGtM1HwDfBN4MLX5dXKVAGxLdi/7eBm5wbfyEiKSLSt5XnTsCXcI44LZ1ZTvlWYKiTkMDXteZ/vh84y6ojIhNaeS5jWsVaJsa0zS+ABSKyEajki/tMvAG85Axw/6CpnVX1HRE5BfjE+XwvB64HvC2dWFU/E5H1wBZ8YzcrnPKjIvJ94N8iUoFvrKfBr/DdonajM0NsJ3BZ639cY5pnA/DGhBARiVPVcqcF8giwXVX/5HZcJvRZN5cxoeU7IrIB3yy0nsDfW6hvTIewlokxAUxEPgViTij+VsNMM2MChSUTY4wx7WbdXMYYY9rNkokxxph2s2RijDGm3SyZGGOMaTdLJsYYY9rt/wNHpuVhEALuBgAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.plot(x='mother_age', y='num_babies');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at the distribution of `mother_age` after applying different forms of scaling: min-max scaling, clipping, z-score normalization, and winsorizing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_sql = \"\"\"\n",
    "CREATE TEMPORARY FUNCTION CLIP_LESS(x FLOAT64, a FLOAT64) AS (\n",
    "  IF (x < a, a, x)\n",
    ");\n",
    "CREATE TEMPORARY FUNCTION CLIP_GT(x FLOAT64, b FLOAT64) AS (\n",
    "  IF (x > b, b, x)\n",
    ");\n",
    "CREATE TEMPORARY FUNCTION CLIP(x FLOAT64, a FLOAT64, b FLOAT64) AS (\n",
    "  CLIP_GT(CLIP_LESS(x, a), b)\n",
    ");\n",
    "\n",
    "WITH stats AS (\n",
    "    SELECT\n",
    "      MIN(mother_age) AS min_age,\n",
    "      MAX(mother_age) AS max_age,\n",
    "      AVG(mother_age) AS avg_age,\n",
    "      STDDEV(mother_age) AS stddev_age,\n",
    "      APPROX_QUANTILES(mother_age, 100)[OFFSET(1)] AS percentile_1,\n",
    "      APPROX_QUANTILES(mother_age, 100)[OFFSET(99)] AS percentile_99\n",
    "    FROM\n",
    "      publicdata.samples.natality\n",
    "    WHERE\n",
    "      year > 2000\n",
    "),\n",
    "\n",
    "scaling AS (\n",
    "    SELECT\n",
    "      mother_age,\n",
    "      weight_pounds,\n",
    "      SAFE_DIVIDE(2*mother_age - max_age - min_age, max_age - min_age) AS minmax_scaled,\n",
    "      CLIP( (mother_age - 30)/15, -1, 1 ) AS clipped,\n",
    "      SAFE_DIVIDE(mother_age - avg_age, stddev_age) AS zscore,\n",
    "      CLIP(mother_age, percentile_1, percentile_99) AS winsorized_1_99,\n",
    "      SAFE_DIVIDE(2*CLIP(mother_age, percentile_1, percentile_99) - percentile_1 - percentile_99, percentile_99 - percentile_1) AS winsorized_scaled\n",
    "    FROM\n",
    "      publicdata.samples.natality, stats\n",
    ")\n",
    "\"\"\"\n",
    "\n",
    "def scaled_stats(age_col):\n",
    "    sql = base_sql + \"\"\"\n",
    "SELECT\n",
    "   {0},\n",
    "   AVG(weight_pounds) AS avg_wt,\n",
    "   COUNT(1) AS num_babies\n",
    "FROM\n",
    "   scaling\n",
    "GROUP BY {0}\n",
    "ORDER BY {0}\n",
    "    \"\"\".format(age_col)\n",
    "    from google.cloud import bigquery\n",
    "    return bigquery.Client().query(sql).to_dataframe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams['figure.figsize'] = [15, 15]\n",
    "plt.rcParams.update({'font.size': 15})\n",
    "fig, axs = plt.subplots(3, 2);\n",
    "scaled_stats('mother_age').plot(x='mother_age', y='num_babies', ax=axs[0, 0]);\n",
    "scaled_stats('minmax_scaled').plot(x='minmax_scaled', y='num_babies', ax=axs[0, 1]);\n",
    "scaled_stats('clipped').plot(x='clipped', y='num_babies', ax=axs[1, 0]);\n",
    "scaled_stats('zscore').plot(x='zscore', y='num_babies', ax=axs[1, 1], xlim=[-2, 2]);\n",
    "scaled_stats('winsorized_1_99').plot(x='winsorized_1_99', y='num_babies', ax=axs[2, 0]);\n",
    "scaled_stats('winsorized_scaled').plot(x='winsorized_scaled', y='num_babies', ax=axs[2, 1]);\n",
    "fig.savefig('scaling.png')\n",
    "plt.close(fig)"
   ]
  },
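  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The four scaling choices in the query above can also be tried locally, without BigQuery. The following is a minimal sketch rather than the notebook's original code: it assumes a synthetic, roughly bell-shaped `age` array as a stand-in for `mother_age`, so the printed ranges are illustrative only. `std(ddof=1)` is used because BigQuery's `STDDEV` is the sample standard deviation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Assumption: synthetic ages stand in for mother_age from the natality table.\n",
    "rng = np.random.RandomState(0)\n",
    "age = np.clip(rng.normal(loc=28, scale=6, size=10000), 12, 55)\n",
    "\n",
    "# Min-max scaling: the minimum maps to -1 and the maximum to +1.\n",
    "minmax_scaled = (2 * age - age.max() - age.min()) / (age.max() - age.min())\n",
    "\n",
    "# Clipping with fixed bounds chosen by hand: center 30, half-width 15.\n",
    "clipped = np.clip((age - 30) / 15, -1, 1)\n",
    "\n",
    "# Z-score normalization: zero mean, unit (sample) standard deviation.\n",
    "zscore = (age - age.mean()) / age.std(ddof=1)\n",
    "\n",
    "# Winsorizing: clip to the 1st and 99th percentiles, then min-max scale.\n",
    "p1, p99 = np.percentile(age, [1, 99])\n",
    "winsorized_1_99 = np.clip(age, p1, p99)\n",
    "winsorized_scaled = (2 * winsorized_1_99 - p1 - p99) / (p99 - p1)\n",
    "\n",
    "for name, values in [('minmax_scaled', minmax_scaled), ('clipped', clipped),\n",
    "                     ('zscore', zscore), ('winsorized_scaled', winsorized_scaled)]:\n",
    "    print('{:>18}: min={:6.2f} max={:6.2f}'.format(name, values.min(), values.max()))"
   ]
  },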
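  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Skewed data\n",
    "\n",
    "As an example of highly skewed data, suppose we are building a model to predict the likely sales of a non-fiction book, and one of the inputs to the model is the popularity of the Wikipedia page for the book's topic. The number of views of pages in Wikipedia is highly skewed: a few pages are viewed far more often than the rest. The cells that follow compare three ways of reducing that skew: taking the fourth root of the log of the view count, bucketizing the counts by percentile, and applying a Box-Cox transform."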
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bigquery df\n",
    "WITH bypage AS (\n",
    "SELECT\n",
    "  title,\n",
    "  SUM(views) AS num_views\n",
    "FROM `bigquery-samples.wikipedia_benchmark.Wiki1M`\n",
    "WHERE language = 'en'\n",
    "GROUP BY title\n",
    "HAVING num_views > 10 # non-niche\n",
    "ORDER BY num_views DESC\n",
    "),\n",
    "\n",
    "percentile AS (\n",
    "SELECT\n",
    "  APPROX_QUANTILES(num_views, 100) AS bins\n",
    "FROM\n",
    "  bypage\n",
    ")\n",
    "\n",
    "SELECT\n",
    "  title,\n",
    "  num_views,\n",
    "  # fourth root of the natural log, rounded to one decimal and shifted by 1.3 so values start near zero\n",
    "  (ROUND(POW(LOG(num_views), 0.25), 1) - 1.3) AS fourthroot_log_views,\n",
    "  # percentile bucket: strip the 'bin_' prefix from the ML.BUCKETIZE label\n",
    "  CAST(REPLACE(ML.BUCKETIZE(num_views, bins), 'bin_', '') AS int64) AS bin\n",
    "FROM\n",
    "  percentile, bypage"
   ]
  },
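  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy import stats\n",
    "# Box-Cox requires strictly positive input, which holds here because the query keeps num_views > 10.\n",
    "# With no lambda supplied, boxcox() fits it and returns (transformed data, estimated lambda).\n",
    "data, est_lambda = stats.boxcox(df['num_views'])\n",
    "df['boxcox'] = data"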
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>num_views</th>\n",
       "      <th>fourthroot_log_views</th>\n",
       "      <th>bin</th>\n",
       "      <th>boxcox</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Script_kiddie</td>\n",
       "      <td>92485</td>\n",
       "      <td>0.5</td>\n",
       "      <td>66</td>\n",
       "      <td>1.836813</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ENB</td>\n",
       "      <td>11</td>\n",
       "      <td>-0.1</td>\n",
       "      <td>2</td>\n",
       "      <td>1.340333</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>File:Immortel_(ad_vitam)_movie_poster.jpeg</td>\n",
       "      <td>11</td>\n",
       "      <td>-0.1</td>\n",
       "      <td>2</td>\n",
       "      <td>1.340333</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Watchman_Fellowship</td>\n",
       "      <td>11</td>\n",
       "      <td>-0.1</td>\n",
       "      <td>2</td>\n",
       "      <td>1.340333</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>OREC</td>\n",
       "      <td>11</td>\n",
       "      <td>-0.1</td>\n",
       "      <td>2</td>\n",
       "      <td>1.340333</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39725</th>\n",
       "      <td>Constructivism_(learning_theory)</td>\n",
       "      <td>1653</td>\n",
       "      <td>0.3</td>\n",
       "      <td>65</td>\n",
       "      <td>1.807664</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39726</th>\n",
       "      <td>Avril_lavigne</td>\n",
       "      <td>2108</td>\n",
       "      <td>0.4</td>\n",
       "      <td>65</td>\n",
       "      <td>1.811728</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39727</th>\n",
       "      <td>Boeing_767</td>\n",
       "      <td>2398</td>\n",
       "      <td>0.4</td>\n",
       "      <td>65</td>\n",
       "      <td>1.813674</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39728</th>\n",
       "      <td>LeToya_Luckett</td>\n",
       "      <td>1521</td>\n",
       "      <td>0.3</td>\n",
       "      <td>65</td>\n",
       "      <td>1.806145</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39729</th>\n",
       "      <td>File_Allocation_Table</td>\n",
       "      <td>2420</td>\n",
       "      <td>0.4</td>\n",
       "      <td>65</td>\n",
       "      <td>1.813807</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>39730 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                            title  num_views  \\\n",
       "0                                   Script_kiddie      92485   \n",
       "1                                             ENB         11   \n",
       "2      File:Immortel_(ad_vitam)_movie_poster.jpeg         11   \n",
       "3                             Watchman_Fellowship         11   \n",
       "4                                            OREC         11   \n",
       "...                                           ...        ...   \n",
       "39725            Constructivism_(learning_theory)       1653   \n",
       "39726                               Avril_lavigne       2108   \n",
       "39727                                  Boeing_767       2398   \n",
       "39728                              LeToya_Luckett       1521   \n",
       "39729                       File_Allocation_Table       2420   \n",
       "\n",
       "       fourthroot_log_views  bin    boxcox  \n",
       "0                       0.5   66  1.836813  \n",
       "1                      -0.1    2  1.340333  \n",
       "2                      -0.1    2  1.340333  \n",
       "3                      -0.1    2  1.340333  \n",
       "4                      -0.1    2  1.340333  \n",
       "...                     ...  ...       ...  \n",
       "39725                   0.3   65  1.807664  \n",
       "39726                   0.4   65  1.811728  \n",
       "39727                   0.4   65  1.813674  \n",
       "39728                   0.3   65  1.806145  \n",
       "39729                   0.4   65  1.813807  \n",
       "\n",
       "[39730 rows x 5 columns]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams['figure.figsize'] = [15, 10]\n",
    "plt.rcParams.update({'font.size': 15})\n",
    "fig, axs = plt.subplots(1, 4);\n",
    "\n",
    "for axno, name in enumerate('num_views,fourthroot_log_views,bin,boxcox'.split(',')):\n",
    "    df.hist(histtype='bar', bins=20, column=name, ax=axs[axno]);\n",
    "fig.savefig('skew_log.png')\n",
    "plt.close(fig)"
   ]
  },
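  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see how each transform changes the shape of a skewed input without querying BigQuery, here is a minimal sketch on synthetic data. The heavy-tailed `num_views` array is an assumption standing in for the Wikipedia counts, and NumPy's `percentile` and `digitize` are swapped in for `APPROX_QUANTILES` and `ML.BUCKETIZE`, so the bucket numbering may differ from the query above. A skewness close to zero indicates a more symmetric distribution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy import stats\n",
    "\n",
    "# Assumption: synthetic heavy-tailed counts stand in for num_views (all values > 10).\n",
    "rng = np.random.RandomState(0)\n",
    "num_views = np.floor(np.exp(rng.normal(loc=5, scale=2, size=10000))) + 11\n",
    "\n",
    "# Fourth root of the natural log, as in the SQL expression (without rounding or shifting).\n",
    "fourthroot_log_views = np.power(np.log(num_views), 0.25)\n",
    "\n",
    "# Quantile bucketizing: 99 percentile edges give bucket indices from 0 to 99.\n",
    "edges = np.percentile(num_views, np.arange(1, 100))\n",
    "bucket = np.digitize(num_views, edges)\n",
    "\n",
    "# Box-Cox: with no lambda supplied, the power parameter is fitted by maximum likelihood.\n",
    "boxcox_views, est_lambda = stats.boxcox(num_views)\n",
    "\n",
    "for name, values in [('num_views', num_views),\n",
    "                     ('fourthroot_log_views', fourthroot_log_views),\n",
    "                     ('bucket', bucket), ('boxcox_views', boxcox_views)]:\n",
    "    print('{:>22}: skewness={:8.3f}'.format(name, stats.skew(values)))\n",
    "print('estimated lambda: {:.3f}'.format(est_lambda))"
   ]
  },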
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2020 Google Inc. Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
